Predictable Aspects of Clerical Work *


Roger T. Lennon 
Research and Test Service, World Book Company 


and 


Brent Baxter 
Ohio State University 


The usual procedure in evaluating the efficiency of aptitude tests in 
the prediction of clerical work is to correlate the test scores with some 
measure or rating of success in clerical work. Few, if any, attempts, 
however, have been made to determine what specific aspects of clerical 
work are related to scores on the types of tests most commonly used in 
selecting clerical workers. This article describes one such attempt, 
carried out in connection with the usual-type investigation of the validity 
of two tests used for predicting clerical success. The study described 
was conducted in a large government agency employing several thousand 
clerical workers. The plan was to have a group of clerical employees 
rated on a series of specific statements concerning different aspects of 
clerical work, and to note on which statements ratings were related sig- 
nificantly to test scores. 

The Tests. The tests involved in this study were as follows:

a. Learning Ability Test. This is a local adaptation of Army Alpha, 
Kansas Revision (by Schrammel and Wood); it includes six subtests: 
arithmetic problems, common sense, same-opposites, rearranged sen- 
tences, analogies, and general information, requiring 27 minutes working 
time in all. It correlates .88 with the Otis Quick-Scoring Test of Mental 
Ability-Gamma Test. “Learning Ability Test” is the designation used
locally, and hereinafter, to refer to this test. 

b. Clerical Aptitude Test. This is a test developed by the authors to 
measure speed and accuracy in the performance of simple clerical tasks. 


* Acknowledgment is gratefully made to Miss Evelyn Potechin for assistance in the 
statistical computations. 
It is also composed of six subtests: alphabetizing, number checking,
coding, digit counting, arithmetic computation, and locating information 
from tabular data. The test requires 21 minutes of working time. Cor- 
relations of .67 with the Civil Service Commission Clerical Examination, 
and .65 with the Learning Ability Test have been obtained. 

Both tests were administered to practically all applicants for clerical 
positions in the agency at the time of employment, but virtually no 
applicants were rejected on the basis of the test results. 

The Subjects. The employees chosen for the study included all who 
had the following qualifications: (1) had taken both tests at the time of 
employment; (2) had been on the job for at least three months; and 
(3) were doing fairly simple clerical work, involving no supervisory re- 
sponsibilities. The group originally chosen for study numbered about 
250, and comprised about 50% clerks of various kinds, 30% stenographers, 
10% typists, and 10% miscellaneous workers such as clock card checkers 
and bookkeeping machine operators.¹

The Check List. One of the criteria which was used to evaluate the 
tests as predictors of clerical success was a so-called Clerical Check List. 
This Check List consisted of a series of statements concerning the work 
of clerical employees. The method used in developing the Check List 
followed in many ways a method originally proposed by Probst.² The
validity of this method as a technique for measuring success in clerical 
work is not discussed herein, since this article deals only with the 
individual items in the List, and not with any composite rating. It is 
not assumed that the statements in the List are independent. The 
entire Check List appears below, together with the directions for filling 
it out which accompanied it. It will be noted that the List includes both 
favorable and unfavorable comments on the quality and quantity of the 
employee’s work. The items are both specific and general; that is, some 
statements are pertinent to only a specific task, such as the transcribing 
of dictation notes, while other statements pertain to characteristics such 
as neatness, orderliness, etc., which apply to all kinds of work. 


Check List for Clerical Employees 


Directions: Below is a series of statements which have been made about 
clerical employees. Read each statement carefully, and decide whether or not 
it is a true statement about the employee under consideration. Then put a 
mark in the margin at the left of the statement in accordance with the
directions below.

a. If you believe the statement is true, put a plus sign (+) before the 
statement. 


1 For the information of those familiar with the classification system in Federal
agencies, these subjects may be identified as grade CAF-1, 2, or 3.
2 J. B. Probst, Personnel, 1931, 8, 20-24.
b. If you believe the statement is not true, put a “0” before it.

c. If you do not have sufficient knowledge about the employee to decide
about the statement, mark it with a question mark (?).

d. If the statement is not pertinent to the work the employee is doing,
mark it with “NP.” For example, if the statement says something about
typing, and the employee being considered does no typing, mark the statement
with “NP.”

e. Place one of the above marks at the left of each statement. There may
be some special event which has helped you to decide about a statement or
some special information. In these cases do not hesitate to note it in the
column under “Comments.”

Code for Marking Statements: True = +; Not true = 0; Don’t know = ?; and
Not pertinent = NP.








Mark
here   Statements

_____  1. Uses the correct printed forms for each operation in his work.
_____  2. Checks on his work have revealed that he makes very few errors.
_____  3. Can arrange tabulations and statistical memoranda in an easily understood form.
_____  4. Attends systematically to matters that require periodic attention.
_____  5. Keeps his records up to date.
_____  6. Is alert to notice the improper routing of a file or letter.
_____  7. Gets important letters out before the end of a day instead of holding them over night.
_____  8. Checks his work for errors before releasing it.
_____  9. Accurately transfers information to forms.
_____ 10. Has overlooked obvious errors in the work he is handling.
_____ 11. Does a fair share of the work in his unit.
_____ 12. Often does necessary but unrequested work on his own initiative.
_____ 13. Transmits messages correctly and intelligibly.
_____ 14. Has had to be corrected for the same error more than once.
_____ 15. Has sufficient knowledge of the work he is handling to detect obvious errors.
_____ 16. Rarely misspells a word.
_____ 17. Some of the stencils cut have had to be retyped.
_____ 18. Sometimes forgets matters which should receive prompt attention.
_____ 19. Sometimes gets lost in routine and forgets important details.
_____ 20. Knows where to go for necessary information.
_____ 21. Is accurate in computations and calculations.
_____ 22. Takes the easiest way when it means the results will be incomplete and not wholly satisfactory.
_____ 23. Work has sometimes had to be assigned to others because he is slow.
_____ 24. Can, when required, compose a letter or memorandum that covers all pertinent points and presents all necessary details.
_____ 25. A “peak” in work to be done discourages him to the detriment of his work.
_____ 26. Resumes work promptly after an interruption.
_____ 27. [Illegible] in letters which reveal errors in English which were overlooked.
_____ 28. Tries to avoid added responsibilities.
_____ 29. Does less work than other people in his unit doing the same kind of work.
_____ 30. Correctly compiles material.
_____ 31. Typed letters are occasionally marred by erasures or strikeovers.
_____ 32. Is eager to learn more about the work of his unit.
_____ 33. His working instructions have to be repeated frequently.
_____ 34. Avoids unnecessary duplication in his work.
_____ 35. Sometimes fails to have adequate supplies and materials on hand.
_____ 36. Has had to do some of his work over because he has started before clearly understanding what is to be done.
_____ 37. Carbon copies have sometimes shown folds, creases, smudges, etc.
_____ 38. If he is assigned more than one job at a time, he becomes confused.
_____ 39. He is familiar with the relation of his work to other work in the branch.
_____ 40. Has started some new assignments without a clear understanding of what is to be done.
_____ 41. On his own initiative he has obtained outside training which has improved his effectiveness as a worker.
_____ 42. Calls attention to possible errors in materials before transcribing them.
_____ 43. Does a fair share of the more difficult tasks within the unit.
_____ 44. Checks on his work reveal that he makes no mistakes.
_____ 45. The speed of note-taking (dictation) is as fast as the work demands.
_____ 46. Has used poor grammar in typing letters from general directions.
_____ 47. When he has to speed up because of a peak load, there is an increase in his percentage of errors.
_____ 48. Has made helpful suggestions about work handled.
_____ 49. Some work has had to be returned to him to be done over because of the quality of the work.
_____ 50. Copies of work show an uneven typing touch.
_____ 51. Reviews his work for discrepancies before turning it in.
_____ 52. Correctly uses the terminology of his job.
_____ 53. Difficult tasks challenge rather than confuse him.
_____ 54. Has sometimes made the wrong number of copies.
_____ 55. Has misdirected mail when sorting it for office distribution.
_____ 56. Is often entrusted with tabulation of difficult materials.
_____ 57. Quickly learns where and how to locate people whom the supervisor often calls.
_____ 58. Returns all letters, records, files, etc., to their proper place promptly.
_____ 59. Distracts or interrupts a dictator unnecessarily.
_____ 60. Is confused by sudden changes in assignment.
_____ 61. Bothers others by asking them how to spell words.
_____ 62. Typed work is always orderly and well aligned.
_____ 63. Does not turn out as much work as the other members of the unit.
_____ 64. Grasps new rules, regulations and procedures quickly.
_____ 65. Is unable to carry on effectively more than one task at a time.
_____ 66. Is an employee to whom the most difficult assignments can be given.
_____ 67. Works equally well with or without close supervision.
_____ 68. Dictators have to slow up because of his inability to take dictation fast enough.
_____ 69. Produces work of acceptable quality even under pressure.
_____ 70. Accurately ascertains the business of visitors and directs them correctly.
_____ 71. Adapts himself easily and quickly when given a new assignment.
_____ 72. Cannot take full shorthand notes so that other stenographers can transcribe from them if necessary.
_____ 73. Checks and corrects such items as names, titles, addresses, file references, dates, etc., before releasing correspondence.
_____ 74. Desk and files are neat and orderly.
_____ 75. Can analyze records quickly so as to locate needed information.
_____ 76. Is frequently asked to make up special memoranda.
_____ 77. Does not always set up letters in accordance with regulations.
_____ 78. Is able to maintain the average rate of production of the unit.
_____ 79. Does not detect inconsistencies in the material typed or transcribed.
_____ 80. Reminds the supervisor of his appointments if he seems to have overlooked them.
_____ 81. Does not remember frequently-used names and numbers.
_____ 82. Because of his slowness you sometimes hesitate to assign some jobs to him.
_____ 83. Has a special ability to compose acceptable letters or memoranda.
_____ 84. In transcribing dictaphone rolls, can make reasonable interpretations where the roll is not clear.
_____ 85. Passes on blame for own errors.
_____ 86. Is able to concentrate on duties despite the frequent interruptions which come with acting as receptionist.
_____ 87. Has been known to make unnecessary duplication in his work.
_____ 88. His working instructions have to be repeated.
_____ 89. In the use of carbons with forms, the carboned material appears in the proper spaces.
_____ 90. Is inclined to sacrifice accuracy for speed.





Copies of the Check List, with directions as shown below, were sent 
to supervisors of the group selected for follow-up. 


Directions for Supervisors Marking the Check Lists 


1. The Personnel Section is making a study of the effectiveness with which
its aptitude tests predict the efficiency of its clerical personnel. In order to
complete this study the Section is requesting that supervisors mark Check Lists
like the one attached for some of their employees. Only those clerical
employees who have been tested and who have been on the job for at least three
months are being studied. These records will not become part of the permanent
files nor will they affect the Civil Service efficiency ratings.

2. It is urged that careful attention be given to checking these lists, for
otherwise they are worthless. Please read and follow carefully the directions
on the first page of the Check List. It is suggested that all of the statements
on the List be read before any of them are marked. About fifteen minutes is
required to complete a Check List.

3. The name of the employee to be checked is in the upper right-hand corner
of the List. The name of the supervisor checking the List should be written
in the upper left-hand corner of the first page. The number of weeks during
which the work of the employee has been reviewed by the supervisor indicated
should be written beneath the name of the supervisor.

4. It is believed that there is no need to discuss the Check List with the
employee unless it is specially desired. In any case it should not be discussed
until after the checking has been done.

5. Please return these Check Lists completed to your Branch Chief within
four days.
No supervisor was asked to complete Check Lists for more than four
employees and most supervisors had to mark only one. Every effort was 
made to have the checking done by the immediate supervisor. Most of 
the reporting supervisors had about fifteen employees under their direc- 
tion. As the Check Lists were returned, each one was reviewed to see 
that the marking had been done according to directions, and that the 
proper supervisor had marked it. Of the Check Lists sent out, 80% were 
returned. The chief reasons for failure of all the Check Lists to be 
returned were separations of the employees or the supervisors, and the 
transferring of employees to different supervisors. 

It is realized that highly accurate ratings were not in all cases ob- 
tained from this group of supervisors, many of whom were untrained 
in checking the performance of employees. The supervisors’ checking 
was internally consistent, however, the correlation being estimated as at 
least .85. Some evidence was obtained on the agreement between super- 
visors on the total Check List score. A group of 62 employees were rated 
by two supervisors, the correlation between pairs of supervisors’ ratings 
being .69. No evidence is available on the reliability or validity of the 
marking of the individual items. Since these Check List ratings leave 
much to be desired as a criterion, any findings as to the number of items 
predicted significantly by the tests are likely to err on the conservative 
side. 

Results 


After the Check Lists had been filled out by the supervisors in accord- 
ance with the procedure described above, an analysis was undertaken to 
determine which of the items were significantly related to scores on the 
tests given at the time of employment; that is, which ones could have 
been predicted, to some extent at least, on the basis of test scores. From 
the total group for which Check Lists were completed, a group with 
high scores and a second group with low scores on the Learning Ability 
Test, constituting the highest and lowest 27%, were selected. The 
number of cases in each group was 58. A tabulation of responses to 
each item was made for these two groups, yielding for each item the 
number in the group for which a “+,” “0,” “NP,” or “?” had been 
recorded. The results of the tabulation appear in Table 1. 

The statistical analysis in this paper deals only with the proportion
of the high and low groups for which each item was marked “+,” i.e.,
true. The proportion considered was the ratio of the number in the
group having the item marked “+” to the sum of the “+” and “0”
markings. Thus allowance was made for the “?” responses, reflecting
inadequate knowledge on the part of the supervisors, and for the “NP”
responses, so that the proportion for any item is based only on those
individuals for whom the item is appropriate, e.g., typing items for
typists. It was sometimes found that the proportion of “NP” markings
for an item was significantly different for high and low scoring groups,
reflecting differences in the kind of work assigned to low and high scoring
people. Inasmuch as this study is interested in the prediction of quality
and quantity of work rather than the kind of work, these differences will
not be discussed here.
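
The group selection and the proportion just described can be sketched in a few lines of Python. This is an illustration only and not part of the original study: the function names are hypothetical, and the split of the non-rated markings between “NP” and “?” in the usage lines is invented, since only the “+” and “0” counts for Item 24 are given in the text.

```python
from collections import Counter


def upper_lower_groups(scores, fraction=0.27):
    """Return (upper, lower) lists of employee ids ranked by test score.

    `scores` maps an employee id to a test score; each group holds the
    top or bottom `fraction` of the employees.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    k = round(len(ranked) * fraction)
    if k == 0:
        return [], []
    return ranked[:k], ranked[-k:]


def proportion_true(markings):
    """Share of "+" among the "+" and "0" markings.

    "?" (don't know) and "NP" (not pertinent) markings are left out of
    the base, as in the article. Returns None if nobody was rated.
    """
    counts = Counter(markings)
    rated = counts["+"] + counts["0"]
    if rated == 0:
        return None
    return counts["+"] / rated


# Item 24, Learning Ability Test: 20 "+" and 4 "0" in the high group,
# 11 "+" and 6 "0" in the low group (58 cases per group). The NP/? split
# of the remaining cases is hypothetical.
high_item_24 = ["+"] * 20 + ["0"] * 4 + ["NP"] * 30 + ["?"] * 4
low_item_24 = ["+"] * 11 + ["0"] * 6 + ["NP"] * 36 + ["?"] * 5

print(round(proportion_true(high_item_24), 3))
print(round(proportion_true(low_item_24), 3))
```

Because the “?” and “NP” markings drop out of the base, the two proportions rest on 24 and 17 cases rather than on the full 58 in each group.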

The chi-square test was used to evaluate the significance of the differences
between the high and low groups, and those differences for which
P was less than .10 were regarded as significant; in other words, those
differences which were at the 10 per cent level of significance.³ In the
calculation of χ², Yates’ correction for continuity⁴ was uniformly used
to provide a more refined test even when the numbers of cases were
sufficiently large so that it was not strictly necessary. Use of this technique
is illustrated by the following example, which is the calculation of
chi-square for Item 24 for the high and low groups on the Learning
Ability Test.

             High     Low
             Score    Score    Total
   +          20       11       31
   0           4        6       10
   Total      24       17       41

$$\chi^2 = \frac{41\left(20 \times 6 - 11 \times 4 - \frac{41}{2}\right)^2}{24 \times 17 \times 10 \times 31} = .998$$

For χ² = .998, with one degree of freedom, P equals approximately .30.
Therefore, according to our standard, this difference is not significant.
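
The Item 24 calculation can be checked with a short Python sketch. It is offered only as an illustration and is not part of the original study: the function names are hypothetical, and the P value is obtained from the complementary error function, which gives the upper-tail probability of chi-square exactly when there is one degree of freedom.

```python
import math


def yates_chi_square(a, b, c, d):
    """Chi-square for the 2x2 table [[a, b], [c, d]] with Yates' correction.

    Rows are the markings ("+" and "0"); columns are the high and low
    scoring groups.
    """
    n = a + b + c + d
    # The corrected difference is not allowed to fall below zero.
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    numerator = n * corrected ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_one_df(chi_square):
    """Upper-tail probability of chi-square with one degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2))


# Item 24: "+" row is 20 (high) and 11 (low); "0" row is 4 and 6.
chi_sq = yates_chi_square(20, 11, 4, 6)
p = p_value_one_df(chi_sq)
print(f"chi-square = {chi_sq:.3f}, P = {p:.2f}")
print("significant at .10" if p < 0.10 else "not significant at .10")
```

The sketch reproduces the chi-square of .998 and a P value a little above .30, in agreement with the conclusion that the difference is not significant by the P less than .10 standard.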

On the basis of this analysis, 12 items were found to differentiate 
significantly between the high and the low groups on the Learning Ability 
Test. These items are indicated in Table 1. 

The same procedure was followed in determining which items were 
significantly related to scores on the Clerical Aptitude Test. High- 
scoring and low-scoring groups on this test, again constituting the highest 
and lowest 27% of the total group, were selected, and tabulations of the 
responses made for these groups. Because there is a substantial correlation
between the Learning Ability and the Clerical Aptitude tests, the
highest scoring groups on the two tests included many of the same
individuals, as did the lowest scoring groups. It was found that 25 items
differentiated significantly between the high scoring and the low scoring
Clerical Aptitude groups, adhering to the same criterion of significance
as above, viz., P less than .10. These items are indicated in Table 1.

3 In the actual computation of χ², it is not necessary to compute the proportions as
such, since the calculations are performed in terms of frequencies; hence these
proportions are not included in Table 1.

4 See Goulden, C. H., Methods of statistical analysis. New York: John Wiley and
Sons, 1939, pp. 102-106.

Table 1
Summary of Responses to Check-List Items for High-Scoring and Low-Scoring Groups on Learning Ability and Clerical Aptitude Tests

[Table body not recoverable. Columns: Item No.; for the Learning Ability Test and for the Clerical Aptitude Test, the number of “+,” “0,” “NP,” and “?” markings in the Upper 27% and Lower 27% groups; and a “Sig.” column for each test.]

It will be observed from Table 1 that of the 90 statements covered in 
the Check List, 28 were found to be significantly related to scores on one 
or both of the tests used at the time of employment and hence are to 
some extent predictable through the use of these tests. Nine items are 
significantly related to scores on both tests; three are predictable by the 
Learning Ability Test but not by the Clerical Aptitude Test, and sixteen
are predictable by the Clerical Aptitude Test but not by the Learning
Ability Test.
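
The counts reported above are mutually consistent, as a brief Python check shows. The variable names are illustrative only; the figures are those given in the text.

```python
# Items significantly related to test scores, as reported in the text.
both_tests = 9
learning_ability_only = 3
clerical_aptitude_only = 16
total_items = 90

learning_ability_total = both_tests + learning_ability_only
clerical_aptitude_total = both_tests + clerical_aptitude_only
predicted_by_either = both_tests + learning_ability_only + clerical_aptitude_only

print(learning_ability_total)    # items related to the Learning Ability Test
print(clerical_aptitude_total)   # items related to the Clerical Aptitude Test
print(predicted_by_either)       # items related to one or both tests
print(f"{predicted_by_either / total_items:.0%} of the Check List")
```

The totals agree with the 12 and 25 items reported for the two tests and with the 28 items related to one or both, which is a little under a third of the 90 statements.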


Discussion 


In order to see if any generalizations could be drawn about the nature 
of predictable items, as compared to those not predicted, the items were 
grouped as well as possible into several roughly homogeneous categories. 
While some items fell readily into the groupings, many statements were 
difficult or impossible to classify. Some seemed to fit almost equally 
well into more than one category. An attempt has been made to give a
name to each group but the specific items themselves should be consulted
to understand what is implied. From a study of the items as thus
arranged, the following findings were derived:


(1) Understanding of the work (Items 20, 33, 36, 40, 48, 64, 71, 81, 88): 
Of nine items in the List which dealt with this subject, eight were pre- 
dicted by the aptitude tests. 

(2) Errors in performance (Items 2, 6, 9, 10, 13, 15, 30, 42, 44, 47, 49, 
54, 55, 69, 70, 77, 79, 90): Of nineteen items dealing with the accuracy 
of the work, only three were predicted. 

(3) Quantity and speed of work (Items 11, 23, 29, 43, 57, 63, 66, 68,
78, 82): All ten items in this group were predicted except Item 11, which
was marked “True” for practically everyone, and Item 68, which deals
with a special skill (note-taking).

(4) Performance of multiple tasks (Items 38, 65): Both items are concerned
with the ability to handle effectively more than one task and both
were predicted.

(5) Unnecessary duplication in work effort (Items 34, 87): Of the two 
items in this group, both were predicted by the Clerical Aptitude Test. 

(6) Typing (Items 17, 31, 37, 50, 62, 79, 89): No item dealing with 
the quality of typing work was predicted. 
(7) Shorthand (Items 45, 59, 68, 72): No item was predicted.

(8) Grammar and spelling (Items 16, 27, 46, 61): No item was pre- 
dicted. 

(9) Statistical work (Items 3, 21, 56, 75): No item was predicted. 

(10) Checking of one’s work (Items 8, 51, 73): No item was predicted. 

(11) Orderliness (Items 1, 5, 35, 58, 74): No item was predicted. 

(12) Attitudes toward work and “personality” traits (Items 25, 26, 28,
32, 60, 85, 86): No item was predicted.


Summary 


In a comparison of ratings on a 90-item Check List for Clerical 
Workers with scores on an intelligence test and a clerical aptitude test, 
the results indicated several aspects of clerical work which are “pre- 
dictable” on the basis of these tests, but pointed on the other hand to the 
need for separate tests in typing, shorthand, statistics, grammar, and 
spelling for positions in which these abilities are required. Even with 
this more complete battery, it is probable that only a portion of the total 
variance in on-the-job performance would be predicted, for it is not 
assumed that the Check List covered adequately all aspects of performance.
As might be anticipated, the aptitude tests did not predict any
of the so-called “personality factors.” The Clerical Aptitude Test was
efficient in predicting speed, amount, and understanding of work but was
deficient in predicting accuracy. This may be due to the nature of this
particular test, which is composed of a series of short subtests, and places
a premium on quick grasp of directions and rapid work. Few errors are
made on the test, and other studies have indicated that these errors have
little significance in predicting on-the-job performance.


Received December 20, 1943. 








A Study on the Use of a Work Sample 


Marion Steel, Benjamin Balinsky, and Hazel Lang 
Vocational Advisory Service, New York City 


The O’Rourke “Ringing an Electric Bell” work sample, among many
other work samples, has been used experimentally as a device for arousing 
and developing interest in various occupations. About three years ago 
a number of Dr. O’Rourke’s work samples were used in the NYA both 
to give short work experiences in various trades and to provide a measure 
of suitability for training in a specific trade. 

The Vocational Advisory Service tentatively included the “Ringing
an Electric Bell” work sample as part of its psychological testing program
about one year ago and set about to evaluate it at the same time. The
“Ringing an Electric Bell” work sample was chosen because an unpublished
report stated that the time scores of the electrical work samples
were found to have product-moment coefficients of correlation of about
.50 with the ratings by foremen on the work of trainees in mechanical
jobs such as machine shop. Other samples, such as one involving woodworking,
gave much lower correlations. The “Ringing an Electric Bell”
work sample took a short time to administer, was cheap in cost, using
less expendable material, and was easy to set up for the next client.²

A try-out of the work sample, preliminary to the actual experiment, 
indicated that dexterities were involved in wiring the unit and also that 
the experience of those working on the sample was a factor to be con- 
sidered. Remarks by those taking the work sample also indicated that 
the material was interesting. With these observations in mind, this 
study was designed to throw some light on the dexterities involved in the 
work sample, the effect of experience and the relative degree of interest 
shown in this material, in comparison with usual tests. 


1 Bulletin: Institute of Educational Research, Teachers’ College, Columbia Univer- 
sity, and the Civics Research Institute, Washington, D. C., Feb. 27, 1940. Kitson, 
Harry D.: Creating vocational interests, Occupations, 1942, 20, 567-571. 

2 The work sample was actually found to be practical for time and cost, the average
time taken to complete the sample being 13 minutes and 49 seconds. It can be stopped 
at the end of 25 minutes since only 8% of all the subjects were still working at that 
time. The only expendable material was the wire and this cost about 1¢ per person. 
Description of Sampling and Tests


For a period of two months the “Ringing an Electric Bell” work
sample was given to a randomly selected group of clients who came to the 
Vocational Advisory Service for vocational guidance. Selection was 
limited only to the extent that the person was at least an elementary 
school graduate and between the ages of sixteen and twenty-five. This 
age group represents the bulk of the Vocational Advisory Service clients 
who come for guidance. Clients were also given some of the battery of 
tests ordinarily administered in vocational guidance. 

At the end of two months, the sampling consisted of 86 individuals, 
49 males and 37 females. The median age for the total group was 18 
years and 9 months. The median educational attainment was high 
school graduation. Eighty-four per cent of the males and eighty per cent
of the females either had completed some portion of the high school course
or were high school graduates.

In administering the work sample, the following materials were laid 
out before the subject in a standardized manner: an electric bell, a push- 
button, three feet of insulated wire, a number 6 dry cell, a penknife, a 
pair of cutting pliers, a screwdriver, and a ruler. The subject was also 
provided with a sheet of instructions giving detailed step-by-step directions
as well as diagrams.³ Each subject was examined individually.
The work sample was introduced by the examiner as follows: “This is a 
job in electricity we would like you to try. You do not need to have 
experience with this kind of material. Just follow the directions on the 
sheet. Start right here. (Point) You have all the material you need. 
Work as quickly as you can and be sure to follow the directions.” 

The total time was taken, including the time spent in reading direc- 
tions. Speed was not emphasized but the stop-watch was plainly visible 
so that the subjects might be aware that they were being timed. The 
final instruction was to press the pushbutton and the work was con- 
sidered complete when the bell rang. The work was stopped if it was 
still incomplete after thirty minutes. 

The regular test battery included the following dexterity tests: the 
O’Connor Finger and Tweezer Dexterity tests and the Minnesota Rate 
of Manipulation, Placing and Turning tests. On the Finger Dexterity 
test, in accordance with the practice of the Vocational Advisory Service, 
the total time for the entire board was used as the score rather than the 
score specified by O’Connor. The scoring methods of the authors of the 
tests were used for the others. 


3 See original directions and diagrams by L. J. O’Rourke, Civics Research Institute,
3506 Patterson Street, N.W., Washington, D. C. 
Experimental Procedure


The clients were divided into two groups. In group A (forty-three 
subjects), the above-mentioned four dexterity tests were given before the 
work sample and in group B (forty-three subjects), the order was reversed. 
The four dexterity tests were never given to more than two subjects at 
one time and the work sample was always given individually. 

The work sample was administered individually in order to allow for 
careful observation of the handling of the tools and materials and the 
approach to the task. The examiners carefully recorded the use of 
directions, facility in handling the tools and material, initial adjustment 
to the task and the reaction to difficulties as well as spontaneous remarks. 
This was done in order to investigate the possibilities of establishing a 
qualitative rating. Each subject was interviewed briefly after the tests 
and the work sample had been administered. The following questions 
were asked: 1. What have you done before that was like this?; 2. Have 
you done any shopwork in school, repair work around the house, or on a
car?; 3. Which of these tests did you like best? Why?; 4. Did you try
harder on one than another? Which one? Why?; and 5. Which was 
the easiest for you? Why? 


Results and Interpretation 


Pearson product-moment correlations were calculated separately be- 
tween the scores on each dexterity test and the time score on the work 
sample. The Minnesota Spatial Relations test and the O’Rourke Vocab- 
ulary test had also been given as part of the battery of tests and Pearson 
product-moment correlations were computed between each of these tests 
and time taken to do the work sample. 

The figures in Table 1 indicate that the correlations are low and that 
many of them are unreliable. Correlations between dexterity tests 
usually run somewhat higher, in the order of .40.⁴ For the males, the 
correlation coefficients between the dexterity tests and the work sample 
are very low. This might be interpreted as meaning that the work 
sample is not measuring the same functions as the dexterity tests in the 
case of the males. The correlations for the females, although low, are 
higher than for the males and approach the order of .40. The differences 
between the correlations for males and females are significant in only the 
case of the placing test. These results must be interpreted in the light 
of the relatively small samplings. 
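
The P.E.r column and the sex-difference column of Table 1 can be checked with a short sketch. It assumes the conventional formulas of the period, which the text does not state: the probable error of r taken as 0.6745(1 − r²)/√N, and the significance of a difference taken as the difference between two coefficients divided by the probable error of that difference. The function names are illustrative only.

```python
from math import sqrt

def probable_error_r(r, n):
    """Probable error of a Pearson r (assumed formula: 0.6745 * (1 - r^2) / sqrt(N))."""
    return 0.6745 * (1 - r ** 2) / sqrt(n)

def difference_ratio(r1, n1, r2, n2):
    """Difference between two r's divided by the probable error of that difference."""
    pe_diff = sqrt(probable_error_r(r1, n1) ** 2 + probable_error_r(r2, n2) ** 2)
    return abs(r1 - r2) / pe_diff

# Placing test, values from Table 1: females r = .498 (N = 35), males r = -.019 (N = 49)
pe_f = probable_error_r(0.498, 35)
pe_m = probable_error_r(-0.019, 49)
ratio = difference_ratio(0.498, 35, -0.019, 49)
print(round(pe_f, 3), round(pe_m, 3), round(ratio, 1))
```

Under these assumptions the Placing row agrees with the table to the rounding shown. Some other rows agree less closely, so the authors' exact procedure may have differed slightly.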

The numbers of male and female cases are too small to make any 
definitive statements about sex differences but the boys made significantly 


⁴ Blum, M., and Candee, B., The selection of department store packers and wrappers 
with the aid of certain psychological tests. J. appl. Psychol., 1941, 25, 76-85. 
Table 1 
Correlations of Work Sample with Various Tests 

                                                         Significance 
                                                         of Differences 
Test                         Sex    N     r      P.E.r   Between Sexes 
Finger Dexterity             M      49    .077   .095 
                             F      35    .346   .099    2.09 
                             Both   84    .153   .072 
Tweezer Dexterity            M      49    .176   .093 
                             F      35    .419   .094    1.83 
                             Both   84    .293   .067 
Placing                      M      49   −.019   .096 
                             F      35    .498   .086    4.0 
                             Both   84    .224   .070 
Turning                      M      49    .097   .095 
                             F      35    .347   .101    1.8 
                             Both   84    .193   .071 
Minnesota Spatial Relations  M      49    .247   .091 
                             F      35    .394   .096    1.1 
                             Both   84    .266   .071 
Vocabulary                   M      48    .220   .092 
                             F      35    .317   .103    0.65 
                             Both   83    .204   .074 





higher scores on the work sample than the girls and this did not occur 
on any other test, as can be seen from an examination of Table 2. The 
critical ratio was 4.15 for the work sample and below 1.0 for each of the 
other tests given. 
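
The critical ratios can be approximated from the means, standard deviations, and group sizes reported in Table 2. The sketch below assumes the usual formula, the difference between the two means divided by the standard error of that difference; the text does not say which formula was used, and the function name is illustrative.

```python
from math import sqrt

def critical_ratio(m1, sd1, n1, m2, sd2, n2):
    """Difference between two means divided by the standard error of the difference."""
    se_diff = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return abs(m1 - m2) / se_diff

# Values as reported in Table 2 (males first, then females).
work_sample = critical_ratio(11.79, 4.56, 49, 16.64, 5.76, 35)   # minutes
placing = critical_ratio(239.18, 21.12, 49, 242.29, 20.15, 35)   # seconds
print(round(work_sample, 2), round(placing, 2))
```

With these inputs the sketch gives values close to the tabled 4.15 for the work sample and 0.683 for the Placing test. Small discrepancies on other rows may reflect rounding or a slightly different formula.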

In order to test the effect of differences in experience between the 
males and females, data on the background of each subject were com- 
piled. The data were available from school records, as well as from work 
histories and accounts of hobbies already obtained from the individuals 
by the counselors. The replies to the questions, “What have you done 
before that is like this?” and “Have you done any shopwork in school, 
repair work around the house, or a car?” asked at the close of testing, 
also gave information about experience. 

The degree of experience was rated as none, little or some. Those 
rated as “none” had no experience at all or only woodworking in elementary 
school. The latter was the base experience for the males. “Little” 
experience was defined as occasional minor repairs at home or as a hobby, 
usually referred to as “fixing things”; or two shop courses, not including 
electricity; or a short time (approximately three months) experience in a 
factory. The criteria for “some” experience were two or more shop 
courses other than woodworking; much and more extensive home or 
Table 2 
Means, Standard Deviations and Critical Ratios Between Sexes 

Test                         Sex    N     M             S.D.     C.R. 
Finger Dexterity             M      49    8.90          1.66 
                             F      35    9.03          1.67     0.194 
                             Both   84    8.95 min      1.41 
Tweezer Dexterity            M      49    6.15          0.93 
                             F      35    6.18          1.11     0.132 
                             Both   84    6.17 min      0.94 
Placing                      M      49    239.18        21.12 
                             F      35    242.29        20.15    0.683 
                             Both   84    240.48 sec    20.46 
Turning                      M      49    190.10        26.20 
                             F      35    192.14        21.19    0.365 
                             Both   84    190.52 sec    22.65 
Minnesota Spatial Relations  M      49    9.96          2.18 
                             F      35    9.80          1.82     0.890 
                             Both   84    9.90 min      1.96 
Vocabulary                   M      48    65.95         13.25 
                             F      35    68.80         15.70    0.769 
                             Both   83    67.14         14.40 
Bell and Battery             M      49    11.79         4.56 
                             F      35    16.64         5.76     4.15 
                             Both   84    13.81 min     5.62 





hobby experience; or an electrical shop course and at least one other shop 
course; or finally three different shop courses, not including electricity. 
Examination of Table 3 indicates a consistent trend for those with 
more experience to complete the work sample in less time. Apparently 
experience is reflected in the time scores. Blum and Candee, in the study 
referred to above, found that experience was a factor in raising the scores 
on tests requiring the handling of concrete materials, specifically the 
placing, turning, and finger dexterity tests. They wrote, “Apparently 








Table 3 
Amount of Experience, Median Time Scores, and Interquartile Ranges for 
Each Sex on Work Sample 

Experience   Sex   No.   Median Time   Q 
None         M     20    12’ 30”       2’ 20” 
             F     24    18’           3’ 45” 
Little       M     8     10’ 30”       4’ 00” 
             F     5     17’           3’ 38” 
Some         M     20    9’ 50”        1’ 48” 
             F     5     9’ 15”        38” 














A Study on the Use of a Work Sample 19 


experience in wrapping does have a slight effect in raising test scores on 
three different tests and in reducing initial differences among the workers 
on the tests.” 

To test quantitatively the relative “interestingness” of the work 
sample and the other tests, the question “Which of these tests did you 
like best?” was asked. The experimental groups had been divided into 
Group A and Group B. Group A had the dexterities first, Group B the 
work sample first. In Group A, 20 of the males and 11 of the females 
liked the work sample best; 2 of the males and 4 of the females liked one 
of the dexterities best, and 4 of the males and 1 of the females had no 
particular preference. One girl of the 17 girls in the A group refused to 
complete the work sample. In Group B 18 of the males and 12 of the 
females liked the work sample best, 3 of the males and 5 of the females 
liked one of the dexterities best, and 1 of the females had no particular 
preference. One girl of the 19 girls in Group B refused to complete the 
work sample. For the total male group, the critical ratio of the difference 
between the per cent liking the work sample best and the per cent 
not liking the work sample best was 7.7. This difference is significant. 
For the total female group, the critical ratio of the difference is 2.6. 
This would mean that the chances are about 9 in 1000 that the difference 
is a chance difference due to sampling. 
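
The figure of about 9 in 1000 can be checked by taking the 2.6 reported for the female group as a difference divided by its standard error and referring it to the normal curve in both directions. That reading is an assumption about the authors' procedure, and the function name is illustrative.

```python
from math import erfc, sqrt

def two_tailed_chance(critical_ratio):
    """Probability of a normal deviate at least this large in either direction."""
    return erfc(critical_ratio / sqrt(2))

p_female = two_tailed_chance(2.6)
print(round(p_female * 1000, 1))  # chances in 1000
```

On this reading the result is a little over 9 chances in 1000, in line with the statement in the text.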


Table 4 
Number of Answers to Question, “Which of These Tests Did You Like Best?” 

                        Group A    Group B    Total 
Test Liked Best*        M    F     M    F     M    F 
1. Work Sample          20   11    18   12    38   23 
2. A Dexterity Test     2    4     3    5     5    9 
3. No preference        4    1     0    1     4    2 





* One female each in Groups A and B refused to complete the work sample. 


Another question, “Which was easiest for you?” was also asked. In 
Group A, only 11 of the males and 4 of the females said that they found 
the work sample easier, 12 of the males and 12 of the females said that 
they found one of the dexterities easier, and 3 of the males and 1 of the 
females said no one test was easier than another. In Group B, only 8 
of the males and 2 of the females indicated that the work sample was 
easier, 10 of the males and 12 of the females found one of the dexterities 
easier and 3 of the males and 5 of the females said no one test was easier. 
Evidently the greater degree of interest in the work sample was not due 
to the fact that it was easier than the other tests. 
Some examples of replies to the question why the work sample was 
liked follow: “This is more like real work, but those are like child’s 
play.” “Just liked it. Get something out of it, the sound.” “Made 
something, know it’s finished. It works. Something I made.” “Has 
more sense to it.” 


Table 5 
Number of Answers to Question, “Which Was Easiest for You?” 

                        Group A    Group B 
Easiest Test            M    F     M    F 
1. Work Sample          11   4     8    2 
2. A Dexterity Test     12   12    10   12 
3. Neither Kind         3    1     3    5 





Tentative Conclusions 


The following conclusions can be drawn from this preliminary study: 


1. The work sample had low correlations with the dexterity tests. 
This might be interpreted to mean that the work sample was measuring 
functions different from those measured by the Finger and Tweezer 
Dexterity tests and by the Placing and Turning tests. 

2. A significant sex difference was obtained on the work sample for 
the group used in this study. This sex difference must be considered as 
preliminary and might possibly be attributed to differences in degree and 
kind of experiences had by the males and females in this sampling. 

3. The amount of experience was related to the time taken to com- 
plete the work sample. Those with “some” experience completed the 
work unit in less time than those with “little” or “none.” 

4. The work sample was liked best by most individuals tested, both 
male and female, although a greater percentage found it more difficult 
than the dexterity tests. 


Recommendations 


Qualitative descriptions of the performance on the work sample gave 
valuable information to the vocational counselors. The descriptions 
were in terms of work habits and attitudes, facility in handling the tools 
and material, initial adjustment to the task and the reaction to diffi- 
culties. The examiners had checked each other for reliability of the 
qualitative description. However, in order to make the work sample 
more applicable to an industrial situation, it is thought necessary to have 







the qualitative descriptions checked by qualified people in industry 
engaged in such work as, for instance, that of electrical assembly. 

The time score on the work sample also needs validation if the sample 
is to be related to success on a job. It is proposed that the work sample 
be tried out in several shops, such as radio or electrical assembly, where 
shop foreman ratings would be available. The ratings could then be 
compared with the time score to test the validity of the time score. 


Received December 6, 1943. 





A Method of Objectively Measuring Shop Performance * 


Marion White McPherson 
Wayne County Training School, Northville, Michigan 


The need for some device for the diagnosis of trainability and for the 
refined evaluation of achievement in wood shop work is well known. 
A survey of the literature reveals few objective measures of performance 
in this area. In their study for the Committee on Human Migration 
the Minnesota group¹ used actual shop performance to determine the 
validity of their tests. The idea for the compilation of our test originated 
from the methods they used. It was possible that their technique could 
be developed into a short, convenient evaluation and thereby make their 
criteria for the validity of a test the actual performance to be measured. 
Therefore, we have begun an investigation of the practicality of such a 
direct measurement and of its sensitivity to continued wood shop experi- 
ence. Refined data regarding the reliability and the prognostic value of 
this device, once developed, are matters for further research. 

For the copying of a model wood block to constitute a satisfactory 
measure of wood shop achievement, it must be amenable to objective 
scoring and it must include as many as possible of the basic activities. 
To determine these we were assisted by our wood shop teachers² since 
they were able, out of their experience, to identify the important opera- 
tions and the precision which our mentally defective population might 
be expected to achieve. Work with at least the saw, drill and chisel 
should be included. The product should be scored with respect to 
accuracy of dimensions, angles, and locations; to method of determina- 
tion of positions of operation; and to neatness of execution. In addition, 
the convenience of the device would be increased could the sampling be 
integrated into a single piece of wood. All of these considerations 
entered into the construction of the model. 


*From the Wayne County Training School, Robert H. Haskell, M.D., Medical 
Superintendent, Northville, Michigan. Studies in the Psychopathology of Childhood 
and Mental Deficiency, supported by a grant from the McGregor Fund, Detroit, Michi- 
gan. Report No. 65. The achievement measurements described in this paper were 
devised by A. A. Strauss, Z. P. Hoakley, and L. C. Sullivan. 

¹ Paterson, D. G., and Elliott, R. M., et al., Minnesota mechanical ability tests. 
Minneapolis: The University of Minnesota Press, 1930. 

² We wish to express our appreciation to Mr. Edmund Crosby and Mr. Norman 
Running. 
In order to assure uniformity of approach the model was presented 
in four consecutive stages of completion. Each of the four blocks was 
10 inches long, 54% inches wide, and % inch thick. The first block was 
a plain board, cut to certain dimensions; the second had the hole drilled 
in the left side; the third had the above and also the central groove; and 
the fourth was the completed block. These blocks, in a left to right 
progression, were hung on the wall in front of the subject but beyond 
his reach. 

To insure objective evaluation, a scoring pattern was developed. 
This consisted of an outline of the model, drawn on a sheet of transparent 
plastic, that could be superimposed on the subject’s completed block. 
In addition to the outline of the pattern, lines were drawn at uniform 
intervals to indicate the degrees of deviation of the product from the 
model. The appropriateness of the intervals was determined by the 
precision which, with training, these children might be expected to 
achieve. For example, lines were placed at each one-eighth inch interval 
for a reasonable distance at either side of the lines marking the correct 
length of the board. The line marking each correct position was assigned 
a definite value which was reduced by one point for each unit of deviation. 
That is, a block of the correct length was scored 10; one that was one 
unit (⅛″) either too short or too long scored 9; one that deviated in either 
direction by as much as two units (¼″) received a credit of 7, etc. 

In an attempt to determine the feasibility of measuring and scoring 
performance with the method outlined in the preceding paragraphs, the 
model was presented to fifty-nine boys of the Wayne County Training 
School who were enrolled in the wood shop course in the academic year 
1940-41. As enrollment in this shop is a part of the natural sequence of 
the training program for boys, there were no known extraneous factors 
operative in the selection of subjects. All were in their thirteenth year. 
The mean Binet P.C.* was 86.49, S.D. 6.31; the Arthur Performance P.C. 
89.93, S.D. 6.78. 

One boy at a time was admitted to an enclosure of approximately 
10 by 40 feet which had been partitioned off at the end of our wood shop. 
Here he had access to a variety of tools including different sizes of chisels, 
saws, drills and, of course, the correct implements for the assigned task. 
He was given a board of the width and thickness of the models but at 
least one yard in length and was told to make his board look first like 
the one hanging in front of him at the extreme left, then like the second, 
then like the third, and finally like the one at the extreme right. No 


* Hilden, A., Table of Heinis personal constant values. Minneapolis: Educational 
Test Bureau, 1933. 
more specific instructions were given. He was allowed to work without 
time limit and with only casual adult supervision. 

One semester later fourteen of the subjects were again presented with 
the problem of reproducing the model. This was done in order to meas- 
ure the change in score which would result from attendance in the wood 
shop for two hours per day during the half year. 

When the subjects had completed their blocks, a technique for quantifying 
the performance was developed.⁴ Each raw score was multiplied 
by certain numbers ranging from 1 to 6 depending upon the judged 
difficulty of the task and the number of times the particular operation 
was scored. Neatness and method of determination of position of opera- 
tion were evaluated on a point scale. The number of intervals on the 
scale was determined by the units that gave the best approximation of 
a normal probability curve. The total possible score of 300, divided by 
3, left a maximum score convenient to treat and sufficiently large to 
express fine gradations of skill. 

That our scoring method is reliable is indicated by the results of a 
brief study. After eight months of no contact with this research the 
psychometrician, who originally scored the boards, rescored them. Al- 
though 82 readings are required on each of the 59 boards the Pearson 
product moment coefficient of correlation between the two scorings was 
+.97 ± .01. 

Table 1 

The Means and Standard Deviations of the Raw Scores on the Wood Shop Achievement 
Measurements Obtained by the Entire Group and the Training Group 

                    First Scoring                 Second Scoring 
                    Number                        Number 
                    of Boys   Mean     S.D.       of Boys   Mean     S.D. 
Entire group        59        59.51    14.05      —         —        — 
Training group      14        55.93*   15.52      14        70.92*   16.00 

* Fisher’s t for related measures = 4.42, significant at the 1% level. 


⁴ See manual and record blank for detailed description. Models, manuals, scoring 
patterns and record blanks for both wood and metal shop measurement may be purchased 
from C. H. Stoelting Company, 424 N. Homan Avenue, Chicago, Illinois. 

Table 1 indicates the mean and standard deviation of the raw scores 
for the entire group and for the training subgroup. Table 2 presents in 
terms of frequency the changes in scores of the 14 subjects after a half 
year in the wood shop. Fisher’s test of significance indicates that a 
mean gain as large as the 15 points between scores on the first and second 
measures would be found less than one per cent of the time if the true 
mean gain were zero. 
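
Fisher's t for related measures is computed from the per-subject gains rather than from the two group means taken separately. A minimal sketch follows; the individual scores of the 14 boys are not reported, so the four pairs of scores below are invented purely for illustration, and the function name is hypothetical.

```python
from math import sqrt

def paired_t(first, second):
    """t for related measures: mean gain divided by the standard error of the mean gain."""
    gains = [b - a for a, b in zip(first, second)]
    n = len(gains)
    mean_gain = sum(gains) / n
    # Unbiased variance of the gains (n - 1 in the denominator).
    var_gain = sum((g - mean_gain) ** 2 for g in gains) / (n - 1)
    return mean_gain / sqrt(var_gain / n)

# Hypothetical first and second scores for four subjects (not data from the study).
first_scores = [50, 60, 70, 40]
second_scores = [55, 70, 72, 49]
t = paired_t(first_scores, second_scores)
print(round(t, 2))
```

The resulting t is referred to a table of t with n − 1 degrees of freedom, which for the 14 boys of the training group would be 13.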

A Pearson product moment coefficient of correlation between the raw 
scores for the 59 subjects and their Binet ratings was found to be +.43, 
significant at the one per cent level; a similar coefficient between these 
values and the Arthur ratings was found to be +.54, significant at the 
one per cent level. 


Table 2 

The Amount of Change in Scores Between the First Measurement and the One 
Following Training in the Wood Shop (14 individuals) 

Amount of Change    Frequency 
−11 to − 7          1* 
− 6 to − 2          1* 
− 1 to + 3          0 
+ 4 to + 8          2 
+ 9 to +13          3 
+14 to +18          0 
+19 to +23          2 
+24 to +28          3 
+29 to +33          2 

T = 14 
Mean: +15 





* Considering the instability of a number of our children the two training score 
decrements are not unexpected. 
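
The total and the mean of Table 2 can be checked from the class midpoints. The sketch assumes that the two lowest classes each have a frequency of 1, which agrees with the two decrements mentioned in the footnote and with the total of 14; the variable names are illustrative.

```python
# (lower limit, upper limit, frequency) for each class of Table 2
classes = [
    (-11, -7, 1), (-6, -2, 1), (-1, 3, 0), (4, 8, 2), (9, 13, 3),
    (14, 18, 0), (19, 23, 2), (24, 28, 3), (29, 33, 2),
]

total = sum(f for _, _, f in classes)
# Grouped-data mean: weight each class midpoint by its frequency.
grouped_mean = sum((lo + hi) / 2 * f for lo, hi, f in classes) / total
print(total, round(grouped_mean, 1))
```

The grouped mean comes out near the reported +15, as far as a calculation from class midpoints can show.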


The distribution of the raw scores and the retest gains for our group 
indicate that the technique has value. The application of this measure 
to a normal or superior population may demand changes in the precision 
units and the establishment of norms suitable to that group. Thorough 
investigation of this technique involves control of academic subject 
achievement, pre-school shop experience in wood work, verbal or non- 
verbal superiority, race, bilingualism, etc. 


Measurements for Metal-Shop 


The wood-shop study has indicated that an activity can be measured 
directly without the necessity of evaluating performance through the use 
of tests that merely sample behavior. There is no a priori reason why 
such a technique could not be extended to meet the needs of other shop 
activities, for example, those of a metal shop. 

To investigate this, another staff conference was held. The impor- 
tant activities in the metal shop were isolated, their relative difficulty 
determined, and the precision which our children might be expected to 
attain identified. Inspection of the work sample evaluated in the Minnesota 
study (1) did not reveal any one product that would sample as many 
activities as we desired. Consequently, we developed four patterns, the 
reproduction of which necessitates wire bending, sheet metal soldering, 
riveting, locked seaming, wiring, and circular and angular cutting. 

Thirty-three 14-year-old boys were asked to reproduce the model. 
Only the tools necessary for the task were present and all the patterns 
were accessible throughout the reproduction. In this situation the 
measurement is one of the efficiency of the use of tools and does not in- 
volve their selection. No specific instructions were given to the subjects. 

The scoring pattern is of the same type as that used in the wood-shop 
study. It can be superimposed over the bent wire, the cut pattern, and 
the folded metal. Although we have been able to give but superficial 
study to this device, we have compiled the manual and scoring key as a 
means of presenting in detail the important operations that are amenable 
to objective rating. 


Received February 11, 1944. 














An Analysis of Absenteeism in One War Plant 


Neal G. Schenet 
To say that absenteeism is of prime importance in industry today and 
that it is one of the largest problems to be faced on the home front is 
redundant. The mere fact that the personnel man in industry now is 
confronted with the subject wherever he turns, in articles and on the air, 
as well as in the plant, is proof enough. 

It will be the purpose of the present study to determine the nature 
and extent of absenteeism in a typical war production plant with special 
reference to the individual and the collective effects of age, sex, and 
length of company service. Sex differentials are always an obvious start- 
ing point, and a review of the literature on the subject shows that age 
and length of company service would be of interest also. Almost any 
other differential could have been used, such as physical characteristics, 
intelligence test scores, etc., but these appear, at least on the surface, to 
have little relationship to the total problem. This led to the choice of 
the variables used in the present study. 


Review of the Literature 


In reviewing the available literature on absenteeism, it was found to 
fall generally into three classes; namely, data on causes, discussion of 
records and methods of study, and suggestions as to remedies. 

Most of the articles have been unscientific in their approach, omitting 
all or nearly all statistics on the subject, making only broad generaliza- 
tions, and usually being on a speculative, rather than on a practical level. 
Estimates, in these writings, as to the number of hours lost, types of 
absences, etc., vary widely. Bearing this out, one governmental source 
states, “There is no statistical information available to indicate the 
general extent of absenteeism in the war industries. Scattered reports 
from a number of factories reflect rates ranging from between two and 
three per cent to fifteen per cent or more” (11:2). Certain general 
characteristics of absenteeism, as given by several governmental reports, 
may be mentioned. One source states that “. . . absenteeism rates are 
generally higher for women than for men, even on jobs of the same general 
character. . . . Greater sickness rates among women are probably a 
factor in their higher absence rates . . .” (11:2). 




On a basis of comparison of male and female rates we learn, “One 
large war plant reports current absence rates at 4.8 per cent for men and 
7.4 per cent for women. . . . These figures are typical of a number of 
reports” (11:2). Age may be a factor of importance since “a study 
undertaken by one company indicates that absenteeism tends to be 
higher among older workers, increasing rapidly after forty or fifty years 
of age” (11:3). 

There seems to be a series of effects which cause a tendency for 
absences to be numerous on days adjacent to a week-end or holiday. 
“These effects frequently combine to produce . . . the highest (rate of 
the week) on Saturday” (11:3). 

Absence figures appear to indicate that offices, tool cribs, and super- 
vision show lower rates than factory work generally. There is, how- 
ever, no information available, to the writer’s knowledge, to indicate 
whether absences tend to be relatively more frequent on routine as com- 
pared with nonroutine work, or on heavy versus light work. 

The definition of absenteeism suggested by the United States De- 
partment of Labor, and used throughout this paper, is as follows: “Ab- 
senteeism is the absence of a worker during a full shift that he is scheduled 
to work” (11:1). 

In conclusion, as far as the writer could determine from going through 
the literature, very little work has been done in the field of causes of 
absenteeism, or more specifically, in the field of the effects of certain 
variables upon the total field of absences and absenteeism. 


Materials and Methods 


In order to make a meaningful and scientific survey of the problem 
and yet to keep it within reasonable limits as to size and scope, it was 
decided to break the absence figures down by (1) age groups, (2) length 
of company service groups, and (3) sex groups. 

For the purposes of this investigation it was decided that one of the 
plants of the Elgin National Watch Company, by which the writer is 
employed, be used because of comparative ease of access to absence 
records, familiarity of the writer with the plant and its personnel, and 
size of the plant. The plant chosen was Company Plant No. 2, manu- 
facturing a mechanical time fuze for the armed forces. This work is of 
a fine, precision nature, requiring in general a higher type of employee 
than the average factory. The plant has an average labor force of 
approximately 850 to 900, with about 65% women and 35% men. The 
regulation work week at present is six eight-hour days, with Saturday 
an overtime day and not, in the past history of the company, normally 
a working day. 
The period of study used was the first four months of 1943: January, 
February, March, and April. The days of the week studied were Mon- 
day, Tuesday, Wednesday, Thursday and Friday. Saturday was omitted 
specifically because of the fact that absences on this day become of 
importance in and of themselves for certain very definite reasons. For 
example, Saturday is the last day of the week and fatigue will enter into 
the situation; also, Saturday is normally an overtime day and as such 
is considered as “different” by the average employee. The writer found, 
merely by inspection, that these statements are true in this plant as 
shown by the unusually high number of absences for this day in propor- 
tion to all others. It was felt, as a result of these observations, that 
Saturdays should be removed from the study, and if surveyed at all, 
should be the subject of a further, separate study. 

The age groups chosen were (1) thirty years of age and under, (2) 
between thirty and forty years of age, and (3) over forty years of age. 
The length of company service groups chosen were (1) three months and 
under, (2) three to six months, and (3) over six months. Subsequently, 
it was found that the length of company service groups were somewhat 
out of line, being weighted on the side of longest service. Thus, the 
probable error of results found in some of the service groups will be larger 
than if the cases had been more evenly divided. However, in spite of 
this skewed distribution, the facts obtaining in the small groups appear 
to corroborate reasonably well those in the larger groups. 

In determining which statistics to use, it was decided to use the 
number of days lost and also, to disregard any absences under one day 
in length. This last decision was made because of the labor of handling 
the data and because, upon inspection, the basic facts appeared to hold 
true regardless of the length of the absence. “It is . . . difficult (and 
usually unnecessary) to tabulate part-day absences and there is, for 
example, no obvious line of demarcation between part-day absenteeism 
and tardiness” (11:1). 

Individuals used as subjects in the survey were classified as to age 
and length of company service at the beginning of the period; that is, 
on January 1, 1943. Because of a possible error introduced by employees 
entering and leaving the service of the company during the period, it 
was determined to use only those individuals who remained active in one 
specific department of the plant during the course of the study. The 
total number of subjects obtained in this manner was 750; 280 men and 
470 women. 

The general method used in obtaining the raw data was to use the 
figures on the company’s “Daily Time Exception Sheet,” which is made 
out by each department and turned in to the Payroll Department daily. 
On these sheets, only those absences listed as “S” (Sickness), “WP” 
(Without Permission), and “TR” (Time Requested) were used, all others 
(Plant Accident, Vacation, etc.) being omitted as not pertinent to the 
survey. In compiling these statistics and throughout the study, WP 
and TR have been combined into one group which will be referred to 
as “P” (Personal). 

A brief word of explanation for this procedure is in order here. Be- 
cause of the method of marking absentees in this plant, it is felt that 
both WP and TR actually contain a great deal of each other and should 
not, under the present conditions, be separated for the purposes of this 
study. Absences are reported either by telephone, by a friend of the 
absentee, or in person when the absentee returns to work. As a result 
of this, if a person called his department and stated, “I will be absent 
today because I am going to see the doctor,” one department may class 
it as WP because no prior permission was obtained, while another de- 
partment may class it as TR merely because the person was kind enough 
to telephone and not leave them in doubt. This procedure has been 
verified by the writer in personal conversations with the various depart- 
ment heads. Since we cannot separate these intangible amounts of WP 
in the TR heading it appears more logical to group them together. 

No attempt was made to show the duration of the absences, each 
full-day absence being listed as a separate item regardless of whether or 
not it appeared in connection with other full-day absences. That is, 
while we may know how many full days an individual was absent during 
the calendar month, it was not recorded whether or not those days 
absent were scattered throughout the month or localized in one long 
absence. 

When this master list of absences was completed it was possible to 
obtain the number of days lost for sickness and personal reasons by 
department, sex, age, length of service, or by any one or more of these 
variables without regard for the others. Distribution of the data by 
these groups was facilitated by entering the pertinent facts upon a series 
of index cards, one for each employee in the survey. These cards then 
contained (1) the name and department of the employee, (2) the age 
group of the employee, (3) the length of company service group of the 
employee, and (4) all absences listed for the employee broken down by 
calendar month as well as reason for absence. These cards were then 
distributed by the groups listed above. 

The absence figures were then used to calculate the absence rates, by 
means of the following formula: 

$$\text{Rate} = \frac{\text{Number of days lost by group}}{\text{Number of persons in group}}$$

By means of this calculation it is now possible to compare any group 
with any other group or combinations of groups, since we are dealing 
in rates rather than raw data. Also, in order to provide a basis for 
comparison with other sets of statistics, both national and local, the 
absenteeism rates by men and women for the entire four month period 
were figured through the use of the following formula, suggested and 
used by the Bureau of Labor Statistics: 


$$\text{Absenteeism Rate} = \frac{\text{Man-days lost} \times 100}{\text{Man-days scheduled to work}}$$

where Man-days scheduled to work = Number of persons in group × 86,
the number of work days involved in the study.





There may be some question as to the use of two different formulas 
for computing absence rates. The reason for using the first formula is 
to simplify the labor involved, since the second formula requires an addi- 
tional computation. It is not necessary to have the rates computed by 
the Bureau of Labor Statistics formula in all the individual groupings, 
since there is no method of checking and comparing such figures with 
national or area rates. There obviously is a one-to-one correspondence 
between the writer’s formula multiplied by 100/86 and the Bureau of 
Labor Statistics formula. 
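
The relation between the two formulas can be checked with a short sketch. The group size and days lost below are hypothetical figures chosen for illustration; only the 86 work days and the two formulas come from the text, and the function names are assumptions of this example.

```python
WORK_DAYS = 86  # work days covered by the study, as stated in the text


def days_lost_per_person(days_lost, persons):
    """Writer's formula: average number of days lost per employee."""
    return days_lost / persons


def bls_rate(days_lost, persons, work_days=WORK_DAYS):
    """BLS formula: man-days lost as a percentage of man-days scheduled."""
    return days_lost * 100 / (persons * work_days)


# Hypothetical group: 100 employees losing 130 days in total.
simple = days_lost_per_person(130, 100)
percent = bls_rate(130, 100)

# The two rates differ only by the constant factor 100/86.
assert abs(percent - simple * 100 / WORK_DAYS) < 1e-12
print(round(simple, 2), round(percent, 2))
```

Because the conversion factor is constant, any comparison between groups made with the first formula gives the same ordering as one made with the second.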

Upon inspection of the data, several differences appeared to be
outstanding, and in order to determine whether they were significant and
real differences or whether they might easily have occurred due to chance, 
standard deviations of certain of these items were computed. From 
these, critical ratios were determined, and results will be discussed later 
in this paper. 


Results and Discussion 


In a discussion of the results of the present study, it would be most 
pertinent to begin with the field of sex differentials in total and proceed 
from there to the differences within each group and the age and service 
differences which appear to stand out upon inspection of the data. 

The most striking fact is that the female rates are proportionately
much greater in almost every group than the male rates. This appears
true throughout the entire Plant, in all departments, service groups, age
groups, and totals. In total absences in all departments the female rate
is exactly three times the male rate, that for men being 1.3 while that
for women is 3.9. For sickness absences the rate for women is twice the
male rate (1.9 and 0.7). In personal absences the difference is even
greater, the female rate being between three and four times that of the
men (2.0 and 0.6). (It will be recalled that all rates may be thought of
as “average number of days lost per employee.”) While these results
are in line generally with those obtained in other studies it would seem
that the differences are even more pronounced.

Table 1

Absence Rates for Male Employees in Four Largest Departments and for Total of All Departments Included in the Survey

A statistical investigation of these results yields, in the case of male 
totals versus female totals, a critical ratio of 8.6, indicating a very high 
probability that the difference is a real and significant one. In the case 
of male, sickness, versus female, sickness, the critical ratio is 5.2, and 
it is 9.0 for male, personal, versus female, personal. 

One other fact of interest is that for the total group the sickness and 
personal absences are about evenly divided, the former accounting for 
49% of the total and the latter for 51%. This is at variance with a 
statement made by Spriegel and Schulz that “In a study of 10,000 em- 
ployees of both sexes it was found that the causes of their absences were 
40% due to sickness and accident and 60% due to personal reasons”
(10:164). However, this may also be due to a difference in definitions. 

In the field solely of male rates, certain factors appear to be of interest. 
The rates for sickness and personal absences are approximately the same, 
being 0.7 and 0.6, respectively, showing a fairly even distribution of 
absences on this basis. Of the length of company service groups, Group 
Three, or Over Six Months, appears to be the highest, especially from 
the standpoint of sickness absences. This would have no apparent ex- 
planation other than the possible fact that as length of service increases, 
more adherence to the rules of reporting absences follows, which would 
lead to classifying many absences as sickness which would otherwise be 
classed as personal absences, because of not being reported. The critical 
ratio here for men with more than six months service versus men with 
zero to six months service (groups combined because of the small number 
of cases in each) is 4.1. 

The age groups of the men do not appear to be related significantly 
to absence rates. Group Two, thirty to forty years of age, is slightly 
lower in sickness absence rates, but not appreciably so. This seems to 
be at variance with the statement made by Watkins and Dodd that, 
“. . . the time lost by male workers below the age of forty on account 
of illness tends to be lower than the average male disability, but beyond 
forty, males show a rapidly increasing morbidity rate” (15:266). They 
also state elsewhere, “Experience indicates that youthful employees are 
more careless in the matter of punctuality and attendance than are more 
mature workers” (15:265). While this may be true in general, the 
present study does not seem to show this to any great degree in this 
particular plant. The fact that these findings do not agree entirely with 
the results of other studies may be due to certain special factors obtaining 
in this specific plant. However, it is evident that the results are showing 
a general trend toward agreement, since both the younger and older
workers are slightly higher in rate than the intermediate group, as far as
sickness absences are concerned. 

In the field of female rates, it is apparent that here too, the sickness 
and personal absences are approximately the same, the rates being 1.9 
and 2.0 respectively. Here, however, Service Group Two, three to six 
months, is highest in rate, being 1.1 higher than the next group and 1.6 
higher than the lowest. There appears to be no logical explanation for 
this unless, as the writer feels, there is a period of orientation during 
which the new worker tends to be quite regular in attendance, followed 
by a period of laxity in reporting absences, caused, perhaps, by increasing 
familiarity with the plant, after which there is a lapse into a somewhat 
steady groove of cooperation. From the statistical standpoint, this 
difference is not as significant as some others, for the critical ratio in the 
case of women with three to six months service versus women with over 
six months service is only 1.4. However, even this critical ratio indicates 
a probability of .92 that the difference is real. 

Age group one, 30 and under, is no higher than the other groups, 
somewhat reversing the findings of Watkins and Dodd from the stand- 
point of the increased absenteeism of younger workers. However, age 
group three, over 40, is slightly higher than the other groups, bearing 
out their statement, “In the case of female employees the rate remains 
less than the average up to age thirty, but increases beyond that point”
(15:266). In the female rates, the service groups show more variation 
than do the age groups, as in the case of the male rates, tending to point 
to the fact that length of service is more important than age. 

In the realm of departmental differences, the outstanding fact is that 
Final Assembly has more absenteeism than any other department, of the 
four largest ones. (In discussing departmental differences, only the 
largest departments will be used, since they contain nearly 89% of the 
total number of cases.) This holds true in sickness and total absence 
rates, and is evident in the rates of both men and women employees, 
although in the case of men the rate is not markedly higher than in the 
other departments. In this department the sickness rate of both male 
and female is much higher than in the other departments, whereas rates 
of personal absence are only higher in the case of women, and then just 
moderately so. In the case of men, sickness, Final Assembly versus 
men, sickness, Sub-Assembly, the rates are 1.1 and 0.4 respectively, with 
a critical ratio of 1.2, indicating 88 chances in 100 that the difference is 
real. For women in the same comparison, the rates are 2.4 and 1.5 
respectively, with a critical ratio of 2.4, indicating 99 chances in 100 that 
the difference is real. Thus, the difference for women is more significant 
than for men. 
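
The "chances in 100" quoted for the critical ratios above (1.2, 1.4 and 2.4) agree with the one-tailed cumulative normal distribution. That reading is an inference from the quoted figures, not something the writer states, and the function name below is an assumption of this sketch.

```python
import math


def chances_in_100(critical_ratio):
    """One-tailed normal probability for a critical ratio, as a percentage.

    Assumes the critical ratio is referred to the unit normal curve.
    """
    phi = 0.5 * (1.0 + math.erf(critical_ratio / math.sqrt(2.0)))
    return 100.0 * phi


for cr in (1.2, 1.4, 2.4):
    print(cr, round(chances_in_100(cr)))
```

On this reading a critical ratio of about 2.3 or more is needed before the chances reach 99 in 100, which is why the comparison for women stands out more clearly than the one for men.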





36 Neal G. Schenet 


Upon first glance this may seem as though there is relatively more 
sickness in this department than in others. This is undoubtedly true, 
but along with this, the personal absences, at least for women, are also 
considerably higher than in the other departments. This particular 
department, having more trouble with absenteeism than the others, has 
been conducting intensive educational campaigns on the subject. Their 
main emphasis has been upon the need for employees to report their 
absences whenever possible. Also, strong punitive measures have been 
instituted in this department, so that, for example, three unexcused 
absences bring about discharge of the absentee. However, as shown by 
the absence rates, these practices have not greatly reduced the total 
problem.

It has been the general hypothesis in this department that the younger 
workers caused the majority of the absences here, due to sickness,
irresponsibility, etc. The rate charts do not bear this out, however, and,
as a matter of fact, they point out that the younger persons in this de- 
partment (particularly the women) tend to have lower rates than do the 
older employees. 

It is extremely interesting, as well as puzzling, to note that the ab- 
sence rates in general for Final Assembly are appreciably higher than 
those for other departments, in spite of the educational work on the 
subject and the disciplinary measures in effect. There is nothing appar- 
ent in the attitude of supervision here which would tend to influence the 
absence rates, and the psychological factor of working on a finished 
product rather than a small, perhaps unrelated part, should have a con- 
structive influence also. Since the employees must report their absences 
more carefully, it is possible that there is a tendency for the sickness 
absence figures to increase, for it is much easier for an employee to state 
that he is ill than to state that he wants the day off for shopping, or a 
trip. However, this does not explain the increased personal absences 
and the resultant increased totals, and there is nothing in the study which 
would give the answer to the unusual situation obtaining here. 

Further study of departmental differences indicates that in both male 
and female, the Plate Department has a higher personal absence rate, 
appreciably so in the case of women. For example, in the case of women, 
Plate, personal absences versus women, Sub-Assembly, personal ab- 
sences, the rates are 4.2 and 2.0 respectively, with a critical ratio of 5.6. 
Such a difference is not evident anywhere else in the departmental 
statistics. Two factors inherent in the department itself have probably
combined to cause this difference. First, the type of work in this
department is large, rough and dirty as compared to that in the other
departments, and secondly, the production schedules in this department have
been such that there were peaks and valleys in the amount of work
available to the employees. Selection of employees for this department 
generally led to placing less skilled labor there, and such persons tended 
to be rather irresponsible in their attendance records and in reporting 
their absences. Also, since production varied so greatly, a certain 
amount of resentment and lack of interest on the part of the employees 
was detectible, leading to higher absenteeism. During the period of the 
survey, a temporary lay-off was in progress in this department, and while 
these figures were not included in this study, the psychological effect 
upon the remaining employees, from the standpoint of their apparent 
need to the plant and their job security, may have been reflected in their 
attendance records. 

A brief word is in order from the standpoint of the total plant as 
compared to other plants in the same area and to national figures. In 
general, the absence rates are very low compared to other similar plants, 
perhaps due to a concentrated program against absenteeism and the 
generally higher type of employees, on the whole, as a result of the pre- 
cision quality of the work. The following table gives the pertinent facts: 


Table 3

Percentage Absence Rates, April 1943, and Percentage Women Among Wage
Earners, April 1943, in Selected Industries *

                                                        Absence   Per cent of
Industry                                                 Rate       Women
Ammunition (National)                                     5.4        15.8
Explosives (National)                                     3.8        15.3
Instruments and Optical Equipment (National)              6.3        36.6
All Reporting Manufacturing Establishments (National)     6.2        22.3
Manufacturing Establishments in Elgin, Illinois           5.2        29.7
Company Plant No. 2 in Elgin, Illinois                    3.42       65.0

* Sources: National Figures: Bureau of Labor Statistics, U. S. Dept. of Labor.
Elgin Figures: Personal Survey by the writer.


These figures are of special interest mainly because of the fact that 
in Company Plant No. 2, although the percentage of women is much 
higher than in the other industries, the absence rate is substantially 
lower than the average. In other words, the total situation is very good 
when compared to both area and national figures. 


Summary and Conclusions 


In general, as a result of the study, the following facts become evident: 
1. Women have three times as much absenteeism as men, in total rates.
2. Women have approximately twice as much sickness absenteeism as do
men.
3. Women have between three and four times as much personal absenteeism
as do men.
4. It is apparent that these differences are the result of the sex variable,
since they are evident in every age and service group and department.
5. In general, sickness and personal absences were nearly evenly divided.
6. Age groups show no great difference in rates, although there is a slight
tendency for the older employees to be absent more.
7. Service groups show more variation, with the rates tending to increase
as service increases, up to a point at about six months of service.
8. Final Assembly Department has the highest rates in both sickness and
total absences of all the major departments.
9. A striking difference is shown in the case of the personal absences for
women in the Plate Department, the rate here being much higher than
in any other major department.

In conclusion, it would seem that the findings of this study on the 
subject of sex differentials tend to agree with other similar investigations 
as reported in the literature. The age groups show no tendency toward 
being of importance in influencing the absence rates, and the length of 
company service factor would appear to be of more importance here 
from the standpoint of influence upon absenteeism. 

Probably the most important findings of the study aside from these 
mentioned above, are in the uncovering of the two striking departmental 
differences, that is, Final Assembly in total absenteeism, and Plate in 
personal absences for women. While the results in the case of Final 
Assembly are not such that it is possible to give the answer to this situa- 
tion, they do serve to focus attention upon the problem and to disprove 
one possible hypothesis; namely, that the younger workers are causing 
the majority of the absences in this department. 


Received November 3, 1943.


Testing the Pulling Power of Advertisements by the
Split-Run Copy Method * 


Joseph Zubin 
Columbia University 
and 


John G. Peatman 
The City College of New York 


Many different methods are employed by the advertising industry for 
the purpose of determining the effect of advertising copy on sales. One 
of the most common methods is that of “split-run copy” testing. 

The method itself can be briefly described as one in which two (or 
more) forms of a given advertisement are printed in a newspaper or 
magazine of a given issue, the advertisements being alternated in the 
production of the publication medium so that the different forms will be 
randomly distributed to the reading public. When properly carried out, 
such a method should achieve the desired end of securing two randomly 
selected groups of the population under study, equal in size, one of which 
is exposed to the first form of the advertisement and the other to the 
second form. 

The relative pulling power of each form of the copy is then measured 
by the number of replies received. All copy tests therefore need to 
include an offer of some article such as a free sample of the product or a 
souvenir. This article should be sufficiently attractive to call forth a 
volume of response large enough to be subjected to a statistical test of 
the significance of the difference in pulling power between the two forms 
of the advertisement. The free offer needs to be towards the end of 
the advertising copy and relatively inconspicuous in order to insure that 
the response is brought about by the reading of the copy itself rather 
than by the attractiveness of the free sample or souvenir alone. 

* The authors wish to thank Miss Jane E. Farwell for drawing the nomographs
included in this paper.

An example of a split-run copy test is given in an article by Manville
(3) in which the problem was “to determine the relative pulling power
of the words ‘False Teeth’ versus the words ‘Dental Plates’ in headlines”
of copy advertising Polident, a cleanser for false teeth. Two split-run
tests were made, one in the New York Times Sunday Magazine Section
and the second in the New York News Sunday Rotogravure Section.
The results obtained are summarized by the author as follows:

1. The New York Times split-run copy test, made March 9, 1941, 
yielded a number of replies, 51.4 per cent of which were from the F.T. 
(“False Teeth”) copy and 48.6 per cent were from the D.P. (“Dental
Plate”) copy. The author states that “this is what may be called an
inconclusive result; where separation between the competing advertise- 
ment shows less than ten points (not 10 per cent) difference. Polident’s 
experience has shown that no material difference exists between two 
tested advertisements. However, in conclusion, note here that ‘False 
Teeth’ was a shade better.” 

2. The results of the second split-run test made in the N. Y. News 
Sunday Rotogravure Section on September 20, 1942, yielded a set of 
replies, 52.5 per cent of which were from the F.T. copy and 47.5 per cent 
from the D.P. copy. The author presents these results, stating “here is 
another test run to verify results from the first test. . . . Again ‘False 
Teeth’ was a shade stronger—almost mathematically perfect.” 

Since the author gives only the percentage of returns for each copy 
but does not give the total number of returns, it is impossible to evaluate 
the significance of his results on a statistical basis. Furthermore, even 
if the author had given the absolute frequencies instead of the percent- 
ages we would still be unable to treat the results statistically since we 
have no knowledge of the number of readers who are potential buyers of 
the product but failed to respond. The latter data are rarely if ever 
known even to the research worker. If we wish to evaluate the above 
results despite this deficiency, certain assumptions must be made regard- 
ing the number of replies as well as the total number of readers who are 
potential buyers of the product advertised. 

We shall present in this paper several general methods for evaluating 
the results and apply them to Manville’s data as an example. In order 
to test the significance of his results, we shall proceed on two alternative 
assumptions: 

Situation (A): that he received the minimum average cited by Sturgis 
(4) of 100 replies per tested advertisement or 200 for both; 

Situation (B): that he even may have received as many as 500 replies, 
on the average, for each of his tested advertisements or 1000 for both. 

Inasmuch as the greatest difference in the two split-run tests made by 
Manville was obtained from the Daily News, we shall confine our ex- 
amples to those figures, namely, 52.5 per cent for the F.T. copy and 
47.5 per cent for the D.P. copy. 

1 It is clear that on statistical grounds alone, this statement is not tenable. The
absolute difference between two per cents can not be evaluated directly, but must be
referred to its standard error.

If Manville received a total of 200 replies in this test, he would have
received 105 (52.5%) from the F.T. copy and 95 (47.5%) from the D.P. 
copy. 

If, on the other hand, we assume that he received as many as 1000 
replies, he would have received 525 from the F.T. copy and 475 from the 
D.P. copy. 

With an estimate of the actual number of replies received for each 
copy, we are in a position to test the significance of the results, that is to 
say, whether the F.T. copy really had more pulling power than the D.P. 
copy or whether the difference is such as to be attributable to chance. 
Unless we assume, however, that the replies were obtained from a very 
large sample² of potential buyers of the product exposed to the
advertisements, we will need to estimate the size of the sample, N, before we can
make a statistical test for the significance of the result. We shall illus- 
trate the development of the test of significance for both circumstances, 
that is, the one in which we assume a very large sample (100,000 or more) 
and the other where the sample may be relatively small (say 2000). 

Both procedures assume that the two versions of the copy to be tested 
are equally and randomly distributed among readers who are potential 
buyers of the product. It should be apparent that an estimate of the 
size of such groups would not in this particular case be equal to the total 
circulation of the paper on the day of the test. Not every one of the 
readers of the paper would be a potential buyer of the cleanser. The 
actual circulation of the News on the day of the test was reported as 
2,175,429. It is, therefore, safe to assume that the actual number of 
potential buyers exposed to the copy was considerable. 


Assuming then that the two versions of the advertisement reached
groups which were equally saturated with potential buyers, we can draw
up the hypothetical two-by-two table shown in Table 1.


Table 1
Hypothetical Distribution of Responses to the F.T. and D.P. Copies

Potential Buyers Exposed to Copy    F.T. Copy    D.P. Copy    Totals
Responded                               a            c         a + c
Did not respond                         b            d         b + d
Totals                                 N/2          N/2          N


where a is the number of potential buyers who read the F.T. copy and
responded, b the number of potential buyers who read the same copy and
did not respond. The letters c and d represent the corresponding data
for the D.P. copy; N is the total number of readers who are potential
buyers, half of whom, N/2, read the F.T. copy and half the D.P. copy.

2 Large as compared to the number of returns. The discussion of the importance
of the relative size of N will be treated later. Note that the “Sample” includes all
potential buyers of the product who were exposed to the advertising copy, and not
merely the number of those actually replying.

Let us define the “pulling power” of an advertisement as the proportion
of potential buyers who read the copy and were sufficiently moved to
mail in the coupon at the end of the copy. The pulling power of the 
F.T. copy is the proportion a/(N/2), and for the D.P. copy, c/(N/2), or 
2a/N and 2c/N respectively. 

In order to determine whether the difference between the two “pulling
powers” is significant, we apply the simple test for the significance of the
difference between two per cents. The critical ratio of this difference,
CR, is:

$$\text{(I)} \qquad CR = \frac{2a/N - 2c/N}{\sqrt{pq\,[2/N + 2/N]}} = \frac{a - c}{\sqrt{pqN}}$$

where p = (a + c)/N and q = 1 - (a + c)/N.

This equation is somewhat different from the usual one given in some 
elementary texts but it is the more correct form and the justification of 
its use is given elsewhere (5). 








$$\text{(II)} \qquad \text{Hence, } CR = \frac{a - c}{\sqrt{(a + c)\,[1 - (a + c)/N]}}$$








In order to remove the square root sign, we can square both sides of 
the equation and obtain the expression for Chi-square. 


$$\text{(III)} \qquad (CR)^2 = \chi^2 = \frac{(a - c)^2}{(a + c)\left(1 - \frac{a + c}{N}\right)}$$

Fisher and Yates (2) have pointed out that whenever the smallest 
expected frequency (number of responses expected for a given copy when 
chance alone or some other definite hypothesis is assumed to be opera- 
tive) is less than 500, a correction for continuity (designated by them 
as the Yates correction) should be applied. This consists of simply 
reducing the net value of (a — c), the difference between the number of 
responses for the two copies, by unity. Hence equation (III) becomes 


$$\text{(III')} \qquad \chi^2 = \frac{(|a - c| - 1)^2}{(a + c)\left(1 - \frac{a + c}{N}\right)}$$
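
As a check on the algebra, equations (I), (III) and (III') can be sketched in a few lines. The function and argument names are assumptions of this example, and the sample size N = 2000 is only the illustrative "relatively small" figure mentioned earlier, not a known value.

```python
import math


def critical_ratio(a, c, n_potential):
    """Equation (I): difference between the two pulling powers over its standard error."""
    p = (a + c) / n_potential
    q = 1 - p
    standard_error = math.sqrt(p * q * (2 / n_potential + 2 / n_potential))
    return (2 * a / n_potential - 2 * c / n_potential) / standard_error


def chi_square_split_run(a, c, n_potential, yates=True):
    """Equation (III), or (III') when the Yates correction is applied.

    a, c        -- replies received for each copy
    n_potential -- N, potential buyers exposed to the copy (half per form)
    """
    difference = abs(a - c) - (1 if yates else 0)
    total = a + c
    return difference ** 2 / (total * (1 - total / n_potential))


# Situation A counts (105 and 95 replies) with a hypothetical N of 2000.
cr = critical_ratio(105, 95, 2000)
uncorrected = chi_square_split_run(105, 95, 2000, yates=False)
corrected = chi_square_split_run(105, 95, 2000)
print(round(cr ** 2, 4), round(uncorrected, 4), round(corrected, 4))
```

The squared critical ratio and the uncorrected chi-square agree, as the derivation requires, and the corrected value is somewhat smaller.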


Let us now apply this equation to test the significance of Situation A 
where the total responses were 200, 52.5 per cent for the F.T. copy and 
47.5 per cent for D.P. copy. First, however, we must make some assumption
regarding the sample size, N, the total number of potential buyers
who read the copy, regardless of whether they responded or did not
respond.

Case I—Few Replies from a Large Sample: Number of returns very 
small compared to the total number of potential buyers exposed to copy. 

Let us assume for Case I that the number of returns is negligible 
(less than 1%) compared to the total number of readers who are potential 
buyers. In this instance, equation (III’) reduces to a much simpler 
form, as follows: 


If the proportion, (a + c)/N, is negligible, or approaches zero, then

$$\text{(IV)} \qquad \chi^2 = \frac{(|a - c| - 1)^2}{a + c}$$

and for situation A: χ² = (9)²/200 = 0.40, P = .50,
and for situation B: χ² = (49)²/1000 = 2.40, P = .12.

In both of these situations the difference is not statistically significant. 
Hence, under the above assumption of a very large sample of readers 
who are potential buyers, the hypothesis that the two advertisements 
were equal in pulling power is quite tenable; consequently, Manville’s 
implication that the F.T. copy was really more effective would have to 
be rejected. 
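
The Case I figures can be reproduced with a short sketch of equation (IV). The probability is taken from the chi-square distribution with one degree of freedom through the complementary error function; the function names are assumptions of this example.

```python
import math


def chi_square_large_sample(a, c):
    """Equation (IV): Yates-corrected chi-square when (a + c)/N is negligible."""
    return (abs(a - c) - 1) ** 2 / (a + c)


def p_value_1df(chi2):
    """Upper-tail probability of chi-square with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


for label, a, c in (("A", 105, 95), ("B", 525, 475)):
    chi2 = chi_square_large_sample(a, c)
    print(label, round(chi2, 3), round(p_value_1df(chi2), 2))
```

The computed probabilities fall close to the rounded values of .50 and .12 given above, and neither approaches the .01 level.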

We might reverse the question and ask how large should the difference
have been in order to produce a significant difference for situations A
and B. Accepting a value of χ² = 6.635 (P = .01) as the lower limit of
significance, we must solve for a and c, with (a + c) equal to 200 in
situation A and to 1000 in situation B. Hence, we solve equation IV
for the value (|a - c| - 1), as follows:


$$\text{(V)} \qquad (|a - c| - 1)^2 = \chi^2\,(a + c)$$


After determining the value of a — c, we can readily determine a and c 
respectively, since (a + c) is known. 

For situation A: a = 119 and c = 81 (or a = 59.5% and c = 40.5%). 
For situation B: a = 541 and c = 459 (or a = 54.1% and c = 45.9%). 

Consequently for situation A, if one copy had brought 59.5% or more 
of the responses (and the second 40.5% or less), the difference in the 
respective pulling powers would have been significant. 

Similarly in situation B, if one copy had pulled 54.1% or more of the 
responses (and the second 45.9% or less) the difference would have been
significant. 
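
The reversed question can also be worked numerically. The sketch below solves the relation of equation (V) for the difference, treating the counts as continuous, so its output may differ slightly from the whole-number replies quoted above. The function names are assumptions of this example.

```python
import math

CHI2_CRITICAL = 6.635  # chi-square for 1 degree of freedom at P = .01, as in the text


def minimum_difference(total, chi2_critical=CHI2_CRITICAL):
    """Smallest |a - c| for which (|a - c| - 1)^2 / (a + c) reaches chi2_critical."""
    return 1 + math.sqrt(chi2_critical * total)


def winning_share(total, chi2_critical=CHI2_CRITICAL):
    """Per cent of replies the stronger copy needs (continuous approximation)."""
    a = (total + minimum_difference(total, chi2_critical)) / 2
    return 100 * a / total


for total in (200, 1000):
    print(total, round(minimum_difference(total), 1), round(winning_share(total), 1))
```

The required share moves toward 50 per cent as the volume of replies grows, which is the pattern shown by the 59.5 and 54.1 per cent figures for situations A and B.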

We have prepared a nomograph based on the relationship of equation
(V).³ It has been plotted for values of total volume of responses
from 100 to 10,000 and for differences in responses between two copies,
from 11 to 100.

Figure 1 is read as follows: When the total number of potential buyers
exposed to the copies is extremely large, then for any given volume of
responses (a + c) there will be a range of possible differences in response
(a - c) to the two copies, some of which will be significant. To determine
whether the difference between a given set of replies is significant,
proceed as follows:

The total volume of responses (a + c) is found on the line to the
right, and the difference between the frequency of response for the two
copies is found on the line to the left. By joining these two points with
a ruler the exact value of P can be read from the middle line. This
value of P indicates how often a difference as large as (or larger than)
the one observed could arise by chance. When this difference is so large
that it can arise by chance less than 1 time in 100 (P = .01) the difference
is regarded as statistically significant. When the value of P for the
observed difference lies between .05 and .01, the result is doubtful, and
when P exceeds .05, the difference is regarded as insignificant. For
example, when the total volume of responses is 200 and the difference in
replies between the two copies is 19, the value of P found from the
nomograph is about .22 and hence not significant.

The nomograph of Figure 1 can also be used to determine how large
the total volume of responses must be in order that a given difference in
replies will be significant. Thus, when the difference is 50, there must
be no more than 375 total responses if the difference is to remain
significant, since the significance of a given absolute difference decreases
as the number of replies increases.

The nomograph of Figure 1 may also be used to determine the minimal
size of a difference in replies between two copies that will yield a
significant difference for a given volume of responses. Thus, for a volume of
500 responses there must be at least a difference of 58 in the two sets of
replies, for the result to be significant.

Fig. 1. Nomograph I for determining the significance of the difference in number
of responses to two contrasted copies when the total number of responses is less than 1%
of total number of potential buyer-readers.

3 We decided to deal with absolute frequencies in equation V rather than with per
cents (which are more generally used by advertising men) because the equation in its
per cent form is much more complicated than in its absolute frequency form. The
former may be written as follows:

$$\text{(V')} \qquad \chi^2 = \left(|p_1 - p_2| - \frac{1}{a + c}\right)^2 (a + c)$$

In order to take care of the factor 1/(a + c) a double nomograph would have to be used
instead of the single nomograph which is sufficient for equation (V). Had we neglected
the 1/(a + c) factor completely we would have overestimated the value of χ (the root
of χ² which equals the value of the critical ratio) by as much as 1/√(a + c). For 100
replies, χ is overestimated by .1. This is a small enough error, but its size was not
realized until after the nomographs were drawn. The nomograph in its per cent form
has since been drawn up and may be obtained from the authors.

We can also determine the significance of the results obtained from
combining two independent split-run copy tests. Thus, if we were to
combine the values of χ² given by Nomograph I for the results yielded
by the New York Times and the News we would obtain the following 
results for situation A and situation B (1). 


Table 2

Significance of the Difference in Response to the F.T. and D.P. Copies for the
Combined Results of the Split-Run Copy Tests in the Times and the News

                For Situation A        For Situation B
                (a + c) = 200*         (a + c) = 1000
                 χ²       D.F.          χ²       D.F.
Times           .125       1           0.74       1
News            .405       1           2.40       1
Total χ²        .525       2           3.14       2
P               .80                    .20

* Had to be computed directly because nomograph did not extend to such low
values. D.F. represents degrees of freedom.


Under the assumption that a total of only 200 responses was received 
for the F.T. and D.P. copies (situation A), the combined results for the 
difference in response in both newspapers would arise by chance about 
80 times in 100 and such a difference is not statistically significant. 
Under the assumption that 1000 responses were received (situation B), 
the combined results indicate that chance alone would account for the 
difference 20 times in 100, which is again not significant. 
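
The combination step can be sketched in a few lines. This is a minimal illustration, not the authors' computation: independent χ² values are added together with their degrees of freedom, and for exactly 2 degrees of freedom the upper-tail probability has the closed form exp(−χ²/2). The function names are hypothetical.

```python
import math


def combine_chi_squares(components):
    """Add independent chi-square values, each with 1 degree of freedom.
    Returns (total chi-square, total degrees of freedom)."""
    return sum(components), len(components)


def p_value_two_df(chi2):
    """Upper-tail probability of chi-square with exactly 2 degrees of
    freedom, which has the closed form exp(-chi2 / 2)."""
    return math.exp(-chi2 / 2.0)


# Situation B: Times 0.74 and News 2.40, as given in Table 2
total_b, df_b = combine_chi_squares([0.74, 2.40])
print(round(total_b, 2), df_b, round(p_value_two_df(total_b), 2))  # 3.14 2 0.21
```

The same calculation for situation A gives a probability of roughly .77, which a coarse table of χ² would list near .80.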


It should be pointed out here that since the total number of responses is
taken to be at least 200 for situation A, the number of responses expected by
chance for each copy is 100. Fisher and Yates (2) have indicated that when
the smallest expected frequency is not less than 200, the value of χ² when
corrected for continuity will give a good approximation of the true probability.
However, when the number of total responses falls much below 200 the method
of χ² usually fails to give a good approximation to the true probability. In the
case under discussion, however, since the two contrasted groups of readers of
both copies are considered equal in number, the stringency of the above rule
can be relaxed. It has been found in practice that when the two contrasted
groups are equal, the method of Chi-square is still applicable even when the
smallest expected frequency is as low as 5, that is, when the total number of
responses is only 10.⁴


⁴ The reader may wonder whether it is fair to consider only the two columns of our
2 × 2 table in judging whether the contrasted groups are equal. When we compare
the two rows, the contrasted groups are far from equal. However, the value of χ² is
independent of whether we compare the rows or the columns. Hence, equality in either
is sufficient.

















48 Joseph Zubin and John G. Peatman 





Case II—Replies from Small Samples: Number of returns large com- 
pared to the total number of potential buyers exposed to the copy. 

We can now turn our attention to the second possibility previously 
suggested according to which the number of returns represents a signifi- 
cant portion of the total number of potential buyers exposed to the copy. 
Let us accept a sample of 2000 as such a number, 1000 for the F.T. copy 
and 1000 for the D.P. copy. Then the 200 responses of situation A 
would constitute 10% of the total sample, and the corresponding propor- 
tion for the 1000 replies of situation B would be 50%. We can now con- 
struct Table 3 for situation A, Case II. 








Table 3
Data for Situation A, Case II (200 replies from a sample of 2000)

Potential Buyers
Exposed to Copy     F.T. Copy    D.P. Copy    Totals
Responded              105           95         200
Did not respond        895          905        1800
Totals                1000         1000        2000





Table 3 is constructed as follows: First, Manville’s values of 52.5% 
(or 105 returns for the F.T. copy) and 47.5% (or 95 returns for the D.P. 
copy) are entered in the first row of the table together with the total 
volume of 200 replies. Since the total sample of potential buyers is taken 
as equal to 1000 for each copy, the differences between 1000 and the 
number of returns for each copy are entered in the second row of the 
table. Finally, the marginal totals are entered. 

For Table 3, χ² = .45 with a P value of .50, and hence the difference
in pulling power between the two copies is not significant.

For situation B, Case II, we obtain the results of Table 4. 








Table 4
Data for Situation B, Case II (1000 replies from a sample of 2000)

Potential Buyers
Exposed to Copy     F.T. Copy    D.P. Copy    Totals
Responded              525          475        1000
Did not respond        475          525        1000
Totals                1000         1000        2000





Here again the difference falls short of being significant at the 1%
level, χ² = 4.8 (P ≈ .03).

In other words, the difference in the pulling power of the two copies 
is again found to be non-significant for both situations A and B, even 
when the total sample of exposures is small. 

We can now reverse the problem and determine when a difference in
pulling power will be significant if the total sample consists of only 2000
potential buyers exposed to the copy, 1000 for the F.T. copy and 1000
for the D.P. copy, and with the total number of returns taken as 200 for
situation A and 1000 for situation B.

Solving equation (III') for (|a − c| − 1)², we obtain

(VI) $(|a - c| - 1)^2 = \chi^2 (a + c)\left[1 - \frac{a + c}{N}\right]$

Since we know the value of a + c, we can solve the equation readily.
For situation A (200 replies from a total sample of 2000): a − c = 35.56;
and a + c = 200; a = 118; c = 82. Hence if the total sample of exposures
were only 2000 and the responses to the F.T. copy were 118 or
more (and for the D.P. copy 82 or less) the difference would have been
significant.

For situation B (1000 replies from a total sample of 2000): Solving 
equation (VI): a = 530 and c = 470. Hence, if 530 responses or more 
had been received to the F.T. copy (and 470 or less for the D.P. copy) 
from a total sample of only 2000, the difference would have been sig- 
nificant. 
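
The figures for Case II can be reproduced with a short sketch. It assumes, as the form of equation (VI) and the limiting value discussed under Case III imply, that the corrected Chi-square for two equal groups is χ² = (|a − c| − 1)² / [(a + c)(1 − (a + c)/N)]; the function names are illustrative and not part of the original paper.

```python
import math

CRITICAL_CHI2 = 6.635  # 1% point, 1 degree of freedom


def chi_square_equal_groups(a, c, n_total):
    """Corrected chi-square for replies a and c from two equal groups
    of n_total / 2 potential buyers each."""
    replies = a + c
    return (abs(a - c) - 1) ** 2 / (replies * (1 - replies / n_total))


def min_significant_difference(replies, n_total, critical=CRITICAL_CHI2):
    """Equation (VI) solved for |a - c| at the critical chi-square."""
    return 1 + math.sqrt(critical * replies * (1 - replies / n_total))


print(round(chi_square_equal_groups(105, 95, 2000), 2))    # 0.45 (Table 3)
print(round(chi_square_equal_groups(525, 475, 2000), 2))   # 4.8  (Table 4)
print(round(min_significant_difference(200, 2000), 2))     # 35.56
print(round(min_significant_difference(1000, 2000), 1))    # 58.6
```

A required difference of 35.56 in 200 replies rounds up to 118 against 82, and 58.6 in 1000 replies rounds up to 530 against 470, in agreement with the values stated above.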

Case III—Results for a Sample of 4000 Exposures. 

Let us now consider situations A and B, but instead of a total sample
of 2000 potential buyers exposed to the copy let us assume a total of
4000 such exposures. Paradoxically, the value of χ² drops from .45 when
N/2 = 1000 to .37 when N/2 = 2000 for the 200 replies of situation A,
and from 4.8 to 3.20 for the 1000 replies of situation B. Both of these
values of χ² are below the critical value of significance (χ² = 6.635), and
hence the difference cannot be considered significant. But in both cases
the value of χ² decreased as N increased. This is perhaps unexpected
since usually the value of χ² increases as N increases or, in other words,
the significance of a given difference increases as the sample size increases.
We must remember, however, that our problem is somewhat unusual,
since for a given number of replies the pulling powers, 2a/N and 2c/N,
decline as N, the size of the sample, increases and consequently the
numerator in equation (III) also declines. On the other hand, the
denominator increases as N increases since the expression 1 − (a + c)/N
will increase as N increases, providing (a + c) remains unchanged. Thus,
with the numerator decreasing and the denominator increasing, the value
of the ratio, χ², must perforce decline.

However, the value of χ² cannot decline indefinitely, for as N increases
indefinitely, the value of χ² approaches (|a − c| − 1)²/(a + c) as a limit.
Therefore, if the limiting value of χ² is greater than the value 6.635,
increases in the value of N will never render the difference
non-significant, while decreases in N will only increase the significance
of the result. On the other hand, if the limiting value of χ² is less than
6.635, a decrease in N of sufficient size may produce a significant difference.
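
The decline of χ² toward its limit can be illustrated numerically. The sketch below assumes the same corrected formula for equal groups, χ² = (|a − c| − 1)² / [(a + c)(1 − (a + c)/N)], and uses the 525 and 475 replies of situation B; the sample sizes chosen and the function names are illustrative only.

```python
def chi_square_equal_groups(a, c, n_total):
    """Corrected chi-square for replies a and c from two equal groups
    of n_total / 2 potential buyers each."""
    replies = a + c
    return (abs(a - c) - 1) ** 2 / (replies * (1 - replies / n_total))


def limiting_chi_square(a, c):
    """Value approached as the total sample N grows without bound."""
    return (abs(a - c) - 1) ** 2 / (a + c)


for n_total in (1050, 2000, 4000, 100000):
    print(n_total, round(chi_square_equal_groups(525, 475, n_total), 2))
# 1050 50.42
# 2000 4.8
# 4000 3.2
# 100000 2.43
print("limit", round(limiting_chi_square(525, 475), 2))  # limit 2.4
```

Since the limit of about 2.4 lies below 6.635, the difference in situation B can be significant only for a sufficiently small N.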

Case IV—The Maximum Size of Sample that will yield a Significant 
Difference for a Given Number of Replies. 

Since we have determined that the value of χ² decreases for a given
number of replies as N, the size of the total sample, increases, we may
now reverse the problem and inquire how small N must be in order to
produce a significant difference between the pulling power of two
advertisements in a split-run test.

Solving equation (III) for N, we obtain

(VII) $N = \frac{(a + c)^2}{(a + c) - \frac{(|a - c| - 1)^2}{\chi^2}}$

or

(VII') $\frac{1}{N} = \frac{1}{a + c} - \frac{(|a - c| - 1)^2}{(a + c)^2 \chi^2}$


Applying this formula we find that N is 213 for A. Hence when
the total sample is approximately 214, or 107 for each copy,
the results are significant in favor of the F.T. copy. When the sample
increases above 214 the results are no longer statistically significant. But
what happens when the sample falls below 214? Obviously it cannot
fall indefinitely below 214 since the total sample can never be less than
the total volume of responses. In fact, the minimum value for the
sample size is twice the number of responses obtained from the copy with
the larger number of responses. This limiting situation is the one which
produces maximum differentiation between the two copies. This optimum
circumstance is obtained when all the readers of one copy respond
(a hypothetical situation which will hardly ever occur). In the case of
situation A, the optimum circumstance for a possible significant
difference is shown in Table 5.








Table 5
Optimum Circumstance for a Possible Significant Difference in Situation A
(200 replies)

                    F.T. Copy    D.P. Copy    Total
Responded              105           95         200
Did not respond          0           10          10
Total                  105          105         210

In Table 5 there is a total of 210 readers of whom half, 105, read the
F.T. copy and half the D.P. copy. Of the 105 who read the F.T. copy,
all responded, while of those who read the D.P. copy, 10 failed to respond.
The value of Chi square for the above table is 8.51. Thus, when the
volume of responses is 200, of which 105 come in response to the F.T.
copy and 95 to the D.P. copy, we may conclude that the F.T. copy is
superior in pulling power to the D.P. copy as long as the total sample size
lies between 210, the minimal possible size, and 213. When the total
sample goes much beyond 213, the difference is no longer statistically
significant.


For situation B (1000 replies) the optimum result is shown in Table 6. 














Table 6
Optimum Circumstance for a Possible Significant Difference in Situation B
(1000 replies)

                    F.T. Copy    D.P. Copy    Total
Responded              525          475        1000
Did not respond          0           50          50
Total                  525          525        1050








χ² = 50.42 and hence the difference in the pulling power of the F.T.
and D.P. copy of Manville’s Daily News split-run test would have been
statistically significant if the total sample of potential buyers exposed to
the copy were only 1050 and the total number of replies received were
1000. Such a large number of replies as 1000 may very well be unlikely
for the one insertion of the advertisements. In any event, it is even
more unlikely that the more than two million circulation of the Daily
News included not more than 1050 readers who were potential buyers of
Polident.

We can now ask how large may N grow above 1050 before the difference
becomes non-significant. Substituting in equation (VII),

$N = \frac{(1000)^2}{1000 - \frac{(49)^2}{6.635}} = 1566$





Hence, when N is 1566, or 783 readers for each copy, the difference 
between 525 responses for the F.T. copy and 475 for the D.P. copy will 
still be significant. However, when the sample exceeds 1566, the 
difference in pulling power of the two copies becomes statistically doubt- 
ful. In other words, as long as the total number of readers (evenly 
divided between the two copies) lies between 1050 and 1566, the differ- 
ence in pulling power for the 1000 replies of situation B is significant. 
When N increases to about 1566, the difference begins to fall below the
significance level. Thus, when we have some estimate of the number of
readers exposed to split-run copy, we can determine for a given set of
replies whether or not the difference in pulling power is significant.

Fig. 2. Nomograph IIA for determining the number of potential buyer-readers
(N) required to produce a significant difference for a given total frequency of response,
(a + c), and a given difference in frequency for two contrasted copies, (a − c).
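
The range of sample sizes within which a given set of replies remains significant can be sketched as follows. This is a minimal illustration assuming the relation N = (a + c)² / [(a + c) − (|a − c| − 1)²/χ²] at the 1% critical value of 6.635; the function names are hypothetical.

```python
CRITICAL_CHI2 = 6.635  # 1% point, 1 degree of freedom


def max_sample_size(a, c, critical=CRITICAL_CHI2):
    """Largest total sample N (both copies together) for which replies
    a and c still reach the critical chi-square. Returns None when the
    difference stays significant for every N."""
    replies = a + c
    denominator = replies - (abs(a - c) - 1) ** 2 / critical
    if denominator <= 0:
        return None
    return replies ** 2 / denominator


def min_sample_size(a, c):
    """Smallest possible total sample: every reader of the copy with the
    larger number of replies has responded."""
    return 2 * max(a, c)


print(min_sample_size(105, 95), round(max_sample_size(105, 95)))    # 210 213
print(min_sample_size(525, 475), round(max_sample_size(525, 475)))  # 1050 1567
```

The upper bound for situation B comes out at about 1567 by this arithmetic, within a unit of the 1566 quoted in the text.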

Two nomographs (IIA and IIB) have been developed which will
indicate the maximum value of N that will yield a significant difference
between observed results. These are presented in Figures 2 and 3.
They are based on equation (VII'). Letting y = (a + c)²χ²/(|a − c| − 1)²,
equation (VII') becomes:

(VII'') $\frac{1}{N} = \frac{1}{a + c} - \frac{1}{y}$

We can draw up a nomograph for determining the value of y from the
values of (a + c) and (a − c) for the situation when χ² = 6.635 (P = .01).
This nomograph (IIA) is shown in Figure 2.

The left hand column gives the total number of responses (a + c) 
and the right hand column gives the difference in responses between the 
two copies (a — c). The middle column gives the value y which is to 
be used in Nomograph IIB. A ruler connecting the two points for 
(a +c) and (a — c) respectively will cut the middle line at the desired 
value of y. Having determined the value of y, we enter with this value 
of y on the line to the right in the Nomograph IIB and by connecting the 
point y on that line by means of a ruler with the point for the total 
number of responses (a + c) on the line on the left, we can read off in the 
center the value of N required to render the difference significant. 

Example for Nomographs IIA and IIB: When the total number of 
responses to both copies is 1000 and the difference in responses between 
the two copies is 50, we enter Nomograph IIA in the left hand column 
marked (a + c) with the value of 1000 and connect this point with the 
ruler to point 50 on the right hand scale and read off the value of y on 
the middle scale as 2650. With this value of y we enter the right hand 
scale of Nomograph IIB and connecting it up with the point correspond- 
ing to 1000 on the left scale, we read off the value of N = 1500 on the 
middle scale. That is to say, when the total number of responses is 
1000 and the difference in response between the two copies is 50, a total 
sample of not more than 1500 potential buyer-readers would be required 
to make the difference significant. 

Nomographs IIA and IIB can be used in any one of four ways. 
First, to determine the maximum value of N that will yield a significant 
difference for a given set of replies, as was done in the example above. 
Second, to determine that value of a difference in replies (a — c) which 
would be significant for a given size sample (N), and a given volume of 
replies (a +c). Third, to determine the volume of responses (a + c) 
which would yield a significant difference for a given difference in replies
(a − c) and for a given sample size (N). Fourth, to determine the
significance of the observed difference by obtaining the value of χ²
directly. In order to utilize the nomograph for this purpose, only one
computation is required.⁵ First, determine from Nomograph IIB the
value of y_B corresponding to the observed values of (a + c), the total
number of replies, and N, the total sample. Secondly, determine from
Nomograph IIA the value of y_A corresponding to the observed number
of replies (a + c) and the difference in replies (a − c). Since y_B is the
value of y that the observed χ² would require and y_A is the value
computed for χ² = 6.635, the desired value of χ² is obtained from the
following formula:

$\chi^2 = \frac{y_B}{y_A}\,(6.635)$

Knowing the value of χ², we can determine P from any table of χ².
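
The fourth use can be checked without the nomographs. The sketch below is illustrative and rests on two stated relations: y = (a + c)²χ²/(|a − c| − 1)² with χ² set at 6.635 for Nomograph IIA, and 1/N = 1/(a + c) − 1/y for Nomograph IIB. The function names are hypothetical, and the exact arithmetic will differ slightly from values read off a printed scale.

```python
CRITICAL_CHI2 = 6.635  # critical value on which Nomograph IIA is based


def y_from_replies(replies, difference, critical=CRITICAL_CHI2):
    """Quantity given by Nomograph IIA for the observed (a + c) and (a - c):
    y_A = (a + c)^2 * critical / (|a - c| - 1)^2."""
    return replies ** 2 * critical / (abs(difference) - 1) ** 2


def y_from_sample(replies, n_total):
    """Quantity given by Nomograph IIB for the observed (a + c) and N:
    the y_B that satisfies 1/N = 1/(a + c) - 1/y."""
    return 1 / (1 / replies - 1 / n_total)


def observed_chi_square(replies, difference, n_total):
    """Observed chi-square recovered from the two nomograph quantities."""
    y_a = y_from_replies(replies, difference)
    y_b = y_from_sample(replies, n_total)
    return y_b / y_a * CRITICAL_CHI2


# Situation B, Case II: 1000 replies, difference of 50, total sample 2000
print(round(y_from_sample(1000, 2000)))                 # 2000
print(round(y_from_replies(1000, 50)))                  # 2763
print(round(observed_chi_square(1000, 50, 2000), 2))    # 4.8
```

The result agrees with the χ² of 4.8 obtained directly from Table 4.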


Discussion 


Thus far we have dealt only with the statistical aspects of the split- 
run copy technique. However, it is to be emphasized that there are 
certain assumptions which the technique must satisfy before the statis- 
tical method becomes applicable. These assumptions are: 


(1) That the two copies of the advertisement have been so distrib- 
uted among two groups of readers that these groups constitute random 
samples of the population under examination—randomly divided at least 
insofar as the product being advertised is concerned. 

(2) That the two groups contain an approximately equal number of 
potential buyers of the product. (The method can also be applied when 
the potential number of buyers of the two groups is not equal; in such 
case, however, their relative number must be known.) 

(3) That a valid estimate of the maximum size of the sample of 
potential buyers is available. It should be borne in mind that no amount 
of statistical refinement can compensate for any gross error in the original 
estimate of the size of the samples under comparison. 


⁵ A more direct way of accomplishing this purpose is to note the fact that
Nomograph IIA is based on the value of χ² = 6.635 as the critical point. This value of χ²
is found on the y scale opposite the value 31.6 (= √1000) on the (a + c) scale. (The
decimal point has to be provided by the reader.) By providing a duplicate y scale and
moving up or down we can let the critical value of χ² vary from 6.635 to any other
value we please. After determining the value of y for a given observed result by means
of Nomograph IIB, we can enter Nomograph IIA and let the duplicate scale move up
or down until the value of y_B lies under the ruler connecting up (a + c) with (a − c).
The value of χ² for the observed comparison can then be read off at the critical point
(corresponding to 31.6 on the (a + c) scale), with proper care for the decimal point.
The corresponding value of P can be obtained from standard tables.

(4) That the act of clipping the coupon at the end of the copy is
indicative of interest in the product, and furthermore, that this act was 
brought about as a result of the reading of the copy, and not as a result 
of any other factor. 


When two contrasted copies are found to yield results which are 
significantly different, care must be taken not to place too much reliance 
on this fact alone. The statistical procedures described serve to indicate 
whether a difference in a given set of replies is or is not greater than 
would be expected on the basis of chance alone. The attributing of a
superiority in pulling power to one copy over the other must be
buttressed by evidence of a logical or a psychological kind. If the result
flouts common sense, the test should be repeated until it is verified
beyond any doubt.

If it stands up under repetition, it would be well to investigate the 
causes of the differences before the superiority of one form over the other 
is regarded as established. Simple methods for combining the results of 
two or more tests are presented by Fisher (1) and an example has been 
worked out in this paper. 


Application to Other Problems 


The nomographs described in this paper are useful in other situations 
besides the split-run copy technique. They will be found useful in item 
analysis of psychological tests when the two contrasted groups of suc- 
cesses and failures are equal. For this purpose, Nomographs IIA and 
IIB should be used. The advantage of these nomographs over previous 
ones inheres in the fact that the significance of an item can be read off 
directly from the nomographs once the values of (a — c), (a + c) and N 
are known. 

Whenever the incidence of rare events is contrasted in two groups, 
Nomograph I can be used to determine the significance of the difference
in the incidence of the event, if the two contrasted groups are very large 
and equal in size. Examples of studies in which many rare events or 
characteristics are compared in two contrasted groups are quite plentiful. 
Among these are studies in the incidence of rare diseases and accidents 
in different racial or industrial groups. Comparison of the incidence of 
rare words in contrasted authors or manuscripts is another example. 
Nomograph I offers only an approximation to the value of the exact 
probability and should be useful as a screening device for selecting the 
comparisons that need further study with the exact methods developed 
by Fisher (1) or by means of the Poisson distribution which is suitable 
for the evaluation of the statistics of rare events. 

Summary


In determining the relative pulling power of two different forms of
the same advertising copy, a method known as split-run copy testing has
been widely used. Heretofore only rule of thumb methods have
evidently been widely used in the evaluation of the results of this technique.
The present article develops simple formulae which are modifications of 
the basic Chi-square formula for equal contrasted groups (marginal fre- 
quencies in the columns, or in the rows). One of the difficulties charac- 
teristic of many studies in this field is the absence of any precise data 
regarding the actual size of the number of readers of the advertisements 
in question who are potential buyers of the article advertised. Conse- 
quently it becomes necessary to indicate maximal and minimal sizes of 
samples which will yield statistically significant results for a given set of 
returns. Three nomographs are presented for eliminating all computa- 
tions and obtaining the required answers directly. These nomographs 
can be used for any situation in which a comparison is made between the 
frequencies of an event (a reply, in the case of advertising copy tests) 
for two equal contrasted groups. Nomograph I is to be used when the 
frequency of the event under investigation is small (less than 1%) as 
compared to the total sample. Nomographs IIA and IIB are to be used 
when the frequency of the event is considerable (higher than 1% of the 
total sample). The use of these nomographs has the advantage of re- 
quiring no computations whatsoever, the significance of the results being 
determined directly from the raw data. 

An example from the literature of a duplicated split-run test for com- 
paring the pulling power of two advertisements was carefully analyzed. 
The results were found to be statistically non-significant for each test 
separately as well as for both tests taken in combination. 

Received January 3, 1944. 

Relationships Between Strong Vocational Interest Scores
and Other Attitude and Personality Factors 


Leona E. Tyler 
University of Oregon 


At the present stage in the progress of vocational interest research, 
the most challenging problems are those concerned with the nature of 
the characteristics which differentiate one occupational group from
another. A vast mass of information has accumulated with regard to 
empirical differences between groups and relationships between different 
interest scores (4). So far only a few studies have attempted to clarify 
and explain these relationships or to fit the traits under consideration 
into a general theory of personality. 


Purposes of This Study 


The chief aim of the present study was to analyze in some detail 
relationships between scores on the Strong Vocational Interest Blank and 
several measured attitude and personality factors. This involved going 
behind the correlations to discover, if possible, why the scores correlated. 
It was hoped that such a procedure would throw light on the nature of 
differential occupational interests and would also add to our understand- 
ing of personality organization. It is to be remembered that the interest 
tests are practically the only so-called personality tests which have any 
objective outside reference. We know that interest scores reflect choices 
people actually make and plans of action they actually carry out. To 
be able to relate other measured characteristics about which we know 
less to this reference point would have some real advantages. For the 
author, this study is related also to plans for a longitudinal investigation 
of the development of interests in children. Exploration of the relation- 
ships obtaining for adults shows what characteristics are important to 
observe in a genetic study. 


Procedure 


The subjects were college sophomores taking psychology laboratory 
at the University of Oregon in the fall term of 1941-1942, 55 men and 
122 women. The battery of tests consisted of the Strong Vocational 
Interest Blank, the Minnesota Personality Scale with its five sub-tests 
measuring Morale, Social Adjustment, Family Adjustment, Emotional 
Adjustment, and Economic Conservatism, and the Thurstone and Chave
scales, Attitude toward the Church and Attitude toward God (No. 22, Form 
A). The Strong tests were scored on the following group scales: Group I. 
Human Sciences; Group II. Technical Sciences; Group V. Personnel or 
Social Service; Group VIII. Office Work; Group IX. Sales; and Group 
X. Verbal-Linguistic. 

The first step in the analysis was the computation of correlations 
between each of the interest scores and each personality and attitude 
variable, keeping the data for men and women separate. These coefficients
were checked for statistical significance by Fisher’s “t.” All relationships
for which the probability of significance was more than .95 were selected
for further analysis. This consisted of an item analysis of
the personality measurement under consideration. We singled out the 
individuals receiving highest and lowest scores on the interest scale. In 
the men’s group, we compared the 20 highest with the 20 lowest. In the 
women’s group we used the 33 highest and the 33 lowest. The person- 
ality test papers of these individuals were removed from the rest of the 
pile, and the responses made by high and low groups to each item were 
tabulated. On the Minnesota Test, where the respondent marks each 
item 1, 2, 3, 4, or 5, the mean value of the responses in each group was 
obtained, and the difference between means of high and low groups 
checked for significance by Fisher’s “t.” On the Thurstone scales, where
the response consists of checking or not checking each item, simple
percentages were obtained and checked for statistical significance. By this
method, personality items which were clearly related to interest scores 
were sorted out from those which were not. 
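
The significance check on a correlation coefficient can be sketched with the usual relation t = r√(N − 2)/√(1 − r²). The example below is illustrative only: the critical values of t for 53 degrees of freedom are approximate figures from standard tables rather than from this article, the function names are hypothetical, and the coefficients tried are the Social Adjustment correlations of −.32 and −.35 reported for the 55 men.

```python
import math

# Approximate two-tailed critical values of t for 53 degrees of freedom
# (N = 55 men); assumed from standard tables, not given in the article.
T_CRIT_5_PERCENT = 2.006
T_CRIT_1_PERCENT = 2.672


def t_for_correlation(r, n):
    """t statistic for testing a correlation r from n cases against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


def significance_label_n55(r):
    """Classify a correlation based on 55 cases by the approximate
    critical values above."""
    t = abs(t_for_correlation(r, 55))
    if t >= T_CRIT_1_PERCENT:
        return "1% level"
    if t >= T_CRIT_5_PERCENT:
        return "5% level"
    return "not significant"


print(round(t_for_correlation(-0.32, 55), 2), significance_label_n55(-0.32))  # -2.46 5% level
print(round(t_for_correlation(-0.35, 55), 2), significance_label_n55(-0.35))  # -2.72 1% level
print(round(t_for_correlation(0.17, 55), 2), significance_label_n55(0.17))    # 1.26 not significant
```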

The next step, for tests in which there turned out to be enough of 
such items to make the task worthwhile, was to rescore the papers by a 
key which included only the discriminating items. The correlation of 
interest score with this special score based on selected items furnishes a 
better estimate of the relationship of interest and personality factors 
than does the original correlation. For the most significant relationship 
obtained in this way, a further check was made. A group of 38 cases, 
not used in the item analysis, was chosen from the author’s counseling 
files. The interest-personality correlations were calculated for this group 
also. 

One problem arose in the design of this investigation, the solution of 
which was not entirely satisfactory. If the women’s form of the Strong 
Blank were used for women subjects, group scores could not be obtained. 
If the men’s blank were used, there would be some question as to the 
meaning of the women’s scores. Since there is considerable evidence 
that meaningful scores for women can be obtained with the men’s blank, 
we decided to use it, watching for any indication of sex differences in the
results. Unfortunately, only about a third of the subjects were men,
so that the most unequivocal results rest on the smaller number of cases.


Results 


Male Subjects. Correlations between interest scores and other personality
variables are shown in Table 1, which reveals several interesting
facts. First, the Social Adjustment test is the only one from the Minnesota
Personality Scale which shows more than one significant correlation
with Strong scores. The Family and Emotional tests do not correlate
with anything. Morale is negatively related to Verbal interest scores
and Economic Conservatism is negatively related to Human Science interest
scores. Otherwise, correlations involving these two sub-tests are too low
for significance. Second, correlations of the Social score with Groups I
and II, on the one hand, and Group IX, on the other, are opposite in sign
and approximately equal in size. Third, one or both of the religious
attitude scales correlate significantly with all Strong scores except
Group X. The directions of these relationships are consistent. Less
religious attitudes go with science scores, more religious with personnel
and business.


Table 1

Correlations between Strong Group Scores and Personality and
Attitude Scores, N = 55 Men

                              Personality Tests
                                                      Ec.    Attitude Attitude
                    Morale   Social   Family   Emot.  Cons.    God    Church¹
Group I
  Human Science      −.24    −.32*     .00     −.14   −.29*   −.26*    4†
Group II
  Technical          −.03    −.35†     .27     −.06   −.01    −.32*    .32*
Group V
  Personnel           .17     .25     −.12     −.12   −.06     .04    −.29*
Group VIII
  Office              .23     .14     −.03      .09    .24     .17    −.29*
Group IX
  Sales               .03     .40†     .09      .22    .06     .38†   −.25
Group X
  Verbal             −.34†    0       −.23     −.10   −.19    −.02     .23

¹ High scores indicate unfavorable attitude.
* Significant at 5% level (Fisher’s “t”).
† Significant at 1% level.

Item analysis of the relationship between Group X and Morale turned
up very little that was interesting. Only three items gave significant
“t’s.” These are: 2. Joys of family life are much over-rated; 9. A high
school education is worth all the time and trouble it requires; and 39. A
good education is a great comfort to a man out of work. The correlation
in this case seems to reflect simply a slight tendency for men with
high verbal interest scores to be less optimistic about things in general.
Practically all the items showed this trend, but in none except these three
was it marked enough to result in a significant “t” (5% level). A
pessimistic slant in Group X men may be attributed to either the higher
intelligence or the larger number of dislikes on the Strong blank itself
which is known to characterize many men in this group.

Similarly, item analysis of the relationship between Group I scores 
and Economic Conservatism was unfruitful. Only two items showed sig- 
nificant differences: 186. If our economic system were just, there would 
be much less crime; and 193. A man should be allowed to keep as large 
an income as he can get. Here too most of the other items showed a 
tendency for individuals with high Group I scores to respond less con- 
servatively, but differences were slight. Such a trend may again reflect 
nothing more than the slightly superior general intelligence usually asso- 
ciated with high Group I scores. 

It was in the item analysis of the Social subtest that the most inter- 
esting findings appeared. Thirteen items out of the sixty-five showed 
significant differences between high and low groups on one or more of the 
three interest scales with which the scores were correlated. A scrutiny 
of these items indicates that they have in common one rather limited, 
specialized phase of an individual’s social personality—namely, his atti- 
tude about the desirability of friendship with many people. The four 
items on which differences are most pronounced for all three interest 
groups under consideration are: 55. Do you like to meet new people?; 
72. Do you get along as well as the average person in social activities?;
84. Do you like to know a great many people intimately?; and 88. Do 
you prefer to participate in activities leading to friendships with many 
people? To each of these questions, the sales group gives a much more 
social answer than the two science groups. Items concerned with other 
attitudes toward people and social affairs do not differentiate. Self- 
consciousness, embarrassment, unhappiness about one’s blunders, lack of 
aggressiveness, failure to get along with people—none of these has any 
demonstrable relationship to the interest factors differentiating salesmen 
from scientists. The social characteristic involved here is a matter of 
preferences rather than adjustment. 

When the Social test was scored by a key made up of only the 13 
differentiating items, the results shown in Table 2 appeared. There 
seems to be little question that the social trait tapped by the 13 
discriminating items is related more closely to scientific and sales 
interests than is social adjustment in general. The most conclusive 
coefficients are those based on the cases from the files. While the 
Group I r is lower for this group than for the original subjects, 
Group II is about the same, and Group IX is actually higher. The full 
importance of these correlations becomes apparent when we realize that 
on the basis of the 13 social items alone we could predict Strong scores 
on Groups I, II, and IX with about as much accuracy as we predict school 
marks from intelligence test results. 


Table 2 

Correlations between Strong Scores, Groups I, II, and IX, and Scores on Scales for 
Measuring Social Adjustment or Attitudes 

              Complete          13-Item           13-Item 
              Social Scale      Scale             Scale 
              (Orig. Group)     (Orig. Group)     (Cases from Files) 
              N = 55            N = 55            N = 38 
Group I         -.32              -.57              -.44 
Group II        -.35              -.56              -.52 
Group IX         .40               .58               .66 

The item analysis of the Attitude toward God scale showed, as one 
might expect, that the principal differences stemmed from a tendency of 
the scientifically-minded to avoid the more mystical positions, such as 
“God is the underlying reality of life.” There was no difference on anti- 
religious items; few if any of these subjects checked them. On the 
Attitude toward the Church scale, the principal difference between scien- 
tists and personnel and business men seemed to be a greater tendency 
toward skepticism among the former about whether or not the church 
produces all the good effects claimed for it. Differential responses to 
such items as “I feel that church attendance is a good index of the 
nation’s morality” are typical of this trend. On this scale also, the anti- 
religious items were avoided by everybody. The difference between 
scientists and business-personnel men might be described as a difference 
in amount of indifference, not opposition, to the church. 

The last step of the procedure, scoring the test on discriminating items 
alone and recalculating the correlations, was carried out for the Attitude 
toward the Church scale. (There were too few items in the others to make 
it feasible.) The outcome is shown in Table 3. Again the correlations 
are larger with the scores based on discriminating items alone, but the 
increase is less marked than for the social variable. Since on scales of 
this type subjects check only those items with which they agree, it is 
probable that irrelevant items carry less weight in the original score 
than they do where the other scoring technique is used. 

Table 3 

Correlations between Strong Scores, Groups I, II, V, VIII, and IX, and Scores on 
Scales for Measuring Attitudes toward the Church. N = 55 

              Complete 
              Att. Church       19-Item 
              Scale             Scale 
Group I          .34               .46 
Group II         .32               .40 
Group V         -.29              -.28 
Group VIII      -.29              -.40 
Group IX        -.25              -.36 


The possibility that a few items on the Strong test that happened to be 
closely similar to the social items we have sorted out might account for 
the obtained correlations also needed to be checked. An item analysis 
in the reverse direction was used for this purpose. That is, the responses 
on the Strong test of the 20 individuals who were highest on the 13-item 
social scale were tabulated and compared with the responses of the 20 
lowest. Differences showed up on a large number of interest items scat- 
tered throughout all parts of the blank in such diverse areas as, for 
instance, liking geography and handling horses. This suggests that a 
different general outlook differentiates these people and not just a few 
specific preferences. 
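
The comparison of the 20 highest with the 20 lowest scorers can be sketched in code. The paper does not say which significance test was applied to the item differences, so the two-proportion z test used here is an assumption, and the response counts are hypothetical rather than taken from the study.

```python
from math import sqrt
from statistics import NormalDist


def item_difference(high_yes, high_n, low_yes, low_n):
    """Compare one item's endorsement rate in the high and low groups.

    Returns (difference in proportions, z statistic, two-sided p value).
    The pooled two-proportion z test is an assumed stand-in for whatever
    test the original study used.
    """
    p_high = high_yes / high_n
    p_low = low_yes / low_n
    pooled = (high_yes + low_yes) / (high_n + low_n)
    se = sqrt(pooled * (1 - pooled) * (1 / high_n + 1 / low_n))
    if se == 0:
        # Everyone (or no one) endorsed the item: no difference to test.
        return p_high - p_low, 0.0, 1.0
    z = (p_high - p_low) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_high - p_low, z, p_value


# Hypothetical item: 16 of the 20 highest scorers like it, 6 of the 20 lowest.
diff, z, p = item_difference(16, 20, 6, 20)
print(f"difference = {diff:.2f}, z = {z:.2f}, p = {p:.4f}")
```

Running the same function over every item on the blank, and listing the items whose p value falls below the chosen level, gives the kind of scattered set of differentiating items the paragraph describes.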

Another question of some significance is whether or not social and 
religious attitudes are correlated. This seems to be the case. The 
correlation between the special social scale and the special church scale is .40. 
Thus interest scores could be predicted from a combination of the two 
with only slightly more accuracy than from the social score alone. 
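
The claim that combining the two scales adds little can be checked with the standard formula for the multiple correlation of one criterion with two predictors. The sketch below is illustrative only: it uses the Group IX magnitudes from Tables 2 and 3 (.58 and .36) together with the intercorrelation of .40, and it assumes both scales are oriented so that all three coefficients are positive, which the paper does not state.

```python
from math import sqrt


def multiple_r(r1, r2, r12):
    """Multiple correlation of a criterion with two predictors.

    r1, r2: correlations of each predictor with the criterion.
    r12:    correlation between the two predictors.
    """
    if abs(r12) >= 1:
        raise ValueError("predictors must not be perfectly correlated")
    r_squared = (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)
    return sqrt(r_squared)


# Assumed orientation: all coefficients taken as positive magnitudes.
social_alone = 0.58
combined = multiple_r(0.58, 0.36, 0.40)
print(f"social scale alone: {social_alone:.2f}, both scales: {combined:.2f}")
```

Under these assumptions the combined coefficient exceeds the social-scale coefficient by less than .02, which agrees with the statement that the gain is slight.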

Female Subjects. For the 122 women in the study, correlations be- 
tween Strong scores and other personality variables are shown in Table 4. 
There are more statistically significant r’s than for the men’s group 
because the larger number of cases makes a smaller coefficient significant. 
Numerically, most of the coefficients are somewhat lower than those in 
the corresponding cells of Table 1. It is interesting to note that all the 
significant relationships are in the same direction for women as for men. 
The principal difference between Tables 1 and 4 is with regard to the 
religious attitude scales. In Table 4 only one of the relationships 
between religious attitudes and interests is significant and that one 
(Group V vs. Attitude toward God) is very low. Group V interests, which 
for men are most closely associated with favorable attitudes toward the 
church, are in women most clearly related to social characteristics and 
morale. 


Table 4 

Correlations between Strong Group Scores and Personality and 
Attitude Scores. N = 122 Women 

                                      Personality Tests 

                                                        Ec.      Attitude  Attitude 
                    Morale   Social   Family   Emot.    Cons.    God       Church 
Group I 
  Human Science     -.25**   -.27**   -.15     -.20*    -.12      .01       .06 
Group II 
  Technical         -.14     -.27**   -.08     -.07     -.11     -.17       .14 
Group V 
  Personnel          .23**    .34**   -.01      .14     -.19*     .19*     -.05 
Group VIII 
  Office             .19*     .05      .14      .07      .08      .04      -.06 
Group IX 
  Sales              .16      .38**   -.13      .12     [illeg.] [illeg.]  -.15 
Group X 
  Verbal            -.09     -.01     -.07     -.07     -.06     -.05      [illeg.] 

* Significant at 5% level (Fisher’s “t” test). 
** Significant at 1% level. 
[Group IX row: two entries are illegible in the source, and the column of the final value (-.15) is uncertain.] 
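
The remark that a larger number of cases makes a smaller coefficient significant can be illustrated numerically. The table footnote cites Fisher's t test; the sketch below instead uses Fisher's z transformation with a normal approximation, chosen only because it needs nothing beyond the Python standard library, so its p values are approximate and are not the authors' computation.

```python
from math import atanh, sqrt
from statistics import NormalDist


def correlation_p_value(r, n):
    """Approximate two-sided p value for a correlation r based on n cases.

    Uses Fisher's z transformation: atanh(r) * sqrt(n - 3) is treated as
    a standard normal deviate under the hypothesis of zero correlation.
    """
    if n <= 3:
        raise ValueError("need more than 3 cases")
    z = atanh(r) * sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# The same coefficient judged against the women's N and the men's N.
for n in (122, 55):
    p = correlation_p_value(0.19, n)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"r = .19, N = {n}: p = {p:.3f} ({verdict} at the 5% level)")
```

With 122 cases a coefficient of .19 passes the 5% level, whereas with 55 cases the same coefficient does not, which is why Table 4 carries more starred entries despite its numerically lower values.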

Item analysis of scales other than the Social again selected only 
scattered items, difficult to fit into a coherent pattern. But out of the 
53 items on the Social test (Women’s Form) 41 differentiated high from 
low people on one or more of the interest scales. A great many of these 
are significant for Group V only. Women high in personnel interests 
give more extraverted responses than others to almost all types of ques- 
tion. A smaller set of 19 items differentiated scientific and sales interests, 
as did the set in the men’s data. There is more of a tendency, however, 
for items pointing toward social maladjustment to differentiate between 
these interest groups of girls. For instance, the item, ‘‘Do you find it 
easy to make friendly contacts with members of the opposite sex?,” has 
no differentiating value for men, but shows up significantly in the analyses 
of four scales for women (Groups I, II, V, and IX), the more social answer 
characterizing girls with personnel and sales interests, the less social, girls 
with science interests. The results of rescoring the blanks by the 19- 
item key and recalculating correlations are shown in Table 5. Coeffi- 
cients are increased by eliminating non-discriminating items, but not so 
markedly as is true for men. This also may indicate that items having 
to do with social adjustment and happiness are less irrelevant to interests 
for women than for men and thus have a less depressing effect upon 
correlations. Unfortunately, relationships in the women’s data could 
not be checked in a new group, as the counseling files contained too few 
cases of women who had taken the men’s Strong test. 


Table 5 

Correlations between Strong Scores, Groups I, II, and IX, and Social Scores on 
Minnesota Personality Scale. N = 122 Women 

              Complete          Special 
              Social            19-Item 
              Scale             Scale 
Group I         -.27              -.41 
Group II        -.27              -.37 
Group IX         .38               .47 


Discussion 


The facts outlined above can be brought together under a few main 
headings. First, it is to be noted that interest scores on Group I, Group 
II, and Group IX scales are the only ones showing consistent significant 
relationships to other personality factors. We have uncovered nothing of 
any importance about the other group scales, except that women’s 
Group V interests tend to accompany higher socialization and morale 
and men’s Group V and Group VIII interests go with a relatively favor- 
able attitude toward the church. The scientific and sales interests are 
at opposite poles in all the relationships investigated in this study. 
Many of Strong’s reported results show the same trend (4). 

The second generalization is that there is a social factor related to 
interest differences. This factor, for men, is centered around the sort 
of social stimulation they prefer and seek. The more a man tends to 
avoid large numbers of acquaintances and indiscriminate social affairs, 
the more likely he is to show the interests of scientific men. The more 
satisfaction he takes in social affairs involving large numbers of people, 
the more likely he is to resemble salesmen in his interests. Items having 
to do with maladjustments and unhappiness do not differentiate between 
men’s interest types. Women show the same general sort of difference 
between persons with scientific and sales interests, but the nature of the 
social factor is a little less clear-cut. The same items concerned with 
liking or avoiding social affairs constitute its important component, but 
some other less-easily-catalogued items also differentiate. Girls with 
high scientific interests are likely to be somewhat less well-adjusted to 
the opposite sex, somewhat less confident and happy in their social rela- 
tionships than girls with other types of interest. This may be because 
in the direction their interests are taking they are departing rather markedly 
from what society expects of women and are thus introducing some 
strain into their relationships with people. It is impossible to determine 
from the data at hand which of these variables is cause and which effect. 
A genetic study should throw light on these doubtful issues. 

Postulation of an important social factor around which other atti- 
tudes cluster suggests hypotheses explaining several puzzling facts turned 
up from time to time in interest research. Many have wondered, for 
instance, why artists should be classified with physicians, psychologists, 
architects, and dentists, by the correlations between interest scores. 
A similarity in social outlook might be the basis for these correlations. 
The predominance of scientific interests in adolescent boys is another 
such research finding. The social attitudes of 15-year-old boys, if they 
carry other attitudes along with them, could account for this. The 
correlation of interest scores with success in selling life insurance might 
also involve this third factor. If the high-scoring men have a different 
social outlook, that could easily account for their selling more insurance. 
All these hypotheses could be checked statistically without a great deal 
of trouble. 

Another general finding from this study seems worthy of mention: 
the general lack of relationship between interest scores and what we 
loosely term “neurotic tendency.” There is no correlation for either sex 
between interests and family adjustments. There is only one low sig- 
nificant correlation between interests and emotional adjustment, and this 
shows no particular trend in the item analysis. None of the items on the 
social scale referring to relatively serious maladjustments or neurotic 
symptoms differentiate in any interest group. It may well be that there 
are two independent variables in personality, the direction which it takes 
and the success or effectiveness of the adjustment made. We have per- 
haps given too much attention to adjustment in personality study, and 
too little to this other consideration of the direction of development. 
There is certainly no evidence here to suggest that neurotics tend to pile 
up in certain occupations. 

That correlations with religious scales are significant for men but not 
for women may indicate that vocational interests are a more funda- 
mental factor in male than in female personalities in our culture, more 
closely integrated with the individual’s whole cosmic orientation. As in 
most studies, our women subjects averaged slightly more religious than 
the men, but the differences were not statistically significant. It is to 
be remembered that the differences in men are not on the anti-religious 
items but on those showing skepticism. The tendency one might expect 
to find for business men to be more conservative than scientists shows 
up more plainly with regard to religious issues than in regard to economic 
opinions. 

Finally, it should be noted that the results presented in this paper 
coincide in several respects with some other exploratory studies. The 
table given by Darley (2) shows the largest number of significant critical 
ratios for the scientific interest groups and the social scales, particularly 
Social Preferences. Sarbin and Berdie (3) find more significant differ- 
ences in Allport-Vernon scores for Groups I and II than for the others. 
Tussing (5) finds that scoring keys for the Strong blank can be con- 
structed to measure somewhat the same factors as the Bernreuter F-1C 
and F-2S scales and the Bell Social scale, but not the other Bell scales. 
It would be possible to fit the results of this study into Bordin’s (1) 
theory that “in answering a Strong Vocational Interest Test, an indi- 
vidual is expressing his acceptance of a particular view or concept of 
himself in terms of occupational stereotypes,” if we assume that attitudes 
toward social affairs and friendships are important features of the stereo- 
types for some occupations. 


Summary 


1. A social factor having to do with feelings about organized social 
affairs and friendships with many people is related significantly to Group 
I, Group II, and Group IX scores on the Strong Vocational Interest 
Blank. 

2. Religious attitudes show a moderate correlation with vocational 
interests in men but not in women. 

3. Scores indicating neurotic tendencies show no appreciable relation- 
ship to any kind of interest scores. 


Received December 1, 1943. 
A Worry Inventory 


A. H. Martin 
The University of Sydney, Sydney, Australia 


This inquiry originally constituted an endeavour to supply an alter- 
native form to the usual questionnaire used as a measure of maladjusted 
trends in personality. In place of presenting questions requiring the 
underlining of answers either “yes” or “no,” direct but brief descriptions 
of symptoms were set out. Test subjects were first required to underline 
those items which had ever worried them. When this task was completed, 
they were further required to make a ring around the number of 
any of the underlined items which at present still constituted a cause of 
worry. Two scores were thus obtained; the first consisted of an “effective” 
score of worry items; the second score noted existent present 
worries. In administration this inventory proved to be far easier for 
candidates to answer than the usual questionnaire form of personality 
inventory. 

The questionnaire generally used for vocational guidance work in 
Sydney, Australia, for candidates from fifteen to twenty-five years of 
age is Thurstone’s¹ selected questions, modified by a few necessary local 
emendations and the addition of three “jokers,” i.e., catch questions to 
which the better answers are “Yes.” 

Regularly each year some undergraduates have commented on the 
difficulty of following the directions for this questionnaire. The effort 
of fine discrimination required to arrive at honest decisions appears to 
be great. At the same time, the questionnaire does actually sort out 
many cases showing inferiority trends, shyness or “nervousness.” These 
individuals may be directly helped by an interview or two involving a 
brief analysis of the subjects’ self-centered attitude. There were 42 
questions in the original Sydney inventory and the average number of 
“yields” per person was from 10 to 11 questions. Self-confident individuals 
of a good “sales” type tend to register from 3 to 6 yields. Occasionally 
types with pronounced “inferiority” or neurotic trends show 
from 15 to 35 question yields. A Factorial Analysis * shows the questions 
themselves tend to cluster about such trends as schizoid, submissive, 
a-social and manic-depressive. 

1 Thurstone, L. L., and Thurstone, T. G. A neurotic inventory. J. soc. Psychol., 
1930, 1, 3-30. 

* Gibb, C. A. Personality traits by factorial analysis. Aust. J. Psychol. Phil., 
June, 1942, 1-15. 


Worry Inventory 


Since the general principles involved in the Thurstone type of ques- 
tionnaire had proved so useful for many years, it seemed advisable to 
retain the principle involved and yet to seek some simplification of the 
questionnaire method. Accordingly the direct method was decided on 
for a try-out. Every existing list of worries, patterned after Wood- 
worth’s * original lead, tends to be cast in the form of questions. But 
this method is neither economical in its setting out nor is it often easy 
for the candidate to answer. The itemized list, used here, however, 
directly denominates each symptom, which has been reduced as far as 
possible to very simple and unambiguous terms. The method of marking 
the items was first to get subjects to underline those which had ever 
caused them worry. After completing the list thus, they were required 
to put a ring around the number of each item which was still a present 
cause of worry or distress. Two individual scores were thus secured. 
If the items are honestly marked, the first indicates the general nature 
of individual worries; the second score is indicative of the nature and 
extent of existent worries. This procedure of underlining and ringing 
was borrowed from the method prescribed in the Pressey Emotional 
X-O Test. 
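
The two-stage marking procedure yields two scores per subject, and the rule that only underlined items may be ringed can be made explicit in a short sketch. The function name, the dictionary keys, and the sample answers below are hypothetical and serve only as illustration.

```python
def score_worry_inventory(underlined, ringed, n_items=63):
    """Score one answer sheet of the two-stage inventory.

    underlined: item numbers that have ever worried the subject.
    ringed:     item numbers that still worry the subject.
    Returns the two counts and the proportion of past worries still present.
    """
    underlined, ringed = set(underlined), set(ringed)
    valid = set(range(1, n_items + 1))
    if not underlined <= valid:
        raise ValueError("underlined item numbers must lie in 1..n_items")
    if not ringed <= underlined:
        raise ValueError("only underlined items may be ringed")
    ever = len(underlined)
    still = len(ringed)
    return {
        "ever_worried": ever,
        "still_worried": still,
        "proportion_still": still / ever if ever else 0.0,
    }


# Hypothetical sheet: four items underlined, one of them ringed.
print(score_worry_inventory(underlined=[12, 41, 54, 55], ringed=[54]))
```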


The Group Tested 


The original group of individuals tested consisted of undergraduates 
attending classes in Psychology in years I, II or III in the University of 
Sydney, together with a few outside persons attending one of these single 
courses. About fifty per cent of the undergraduates were students en- 
rolled for evening courses. Altogether 100 persons were used as subjects, 
48 being females and 52 being males. Their ages varied from 17 to 47 
years, with a preponderance of younger subjects and a very 
decided skew towards the higher ages. The average age for both sexes 
combined was 20.4 years. In order to enable observations to be made 
upon age trends, the scores were separated into male and female results 
under the following age divisions: (a) 17 to 19 years, (b) 20 to 21 years, 
and (c) over 21 years. No differentiating trends in age or sex appeared, 
either in yields to particular types of items, in actual numbers or in the 
proportion of existent worries to general worry problems. The six sub- 
groups, though fairly uniform in trends, were not large enough for any 
reliable inferences to be drawn. Some degree of selection must have been 
operative in the male section with respect to military service, hence this 
sampling can hardly be regarded as “average” and fully representative. 
In the female section probably no such factors of selection came into play. 

* Franz, S. I. Handbook of mental test methods. New York: Macmillan Co., 1920. 

The subjects did not record their names on the sheets unless they 
wished, but merely indicated sex and age. 


The Present Inventory 

The items, with their two results given alongside, are shown in Table 1. 
The items were arbitrarily selected by the writer on a representative 
basis from a wide array of existing “personality” questionnaires. Where 
two items were not mutually exclusive and tended to cover a particular 
symptom, the better, in his judgment, was retained. 


Table 1 

Items in the Worry Inventory with Percentage of “Yields” for 
100 University Students 

                                                                  Percentage 
                                                                  of Yields 
                                                                  I *   II ** 
 1. Poor health 
 2. Being different in appearance to other folk 
 3. Having a poor appetite for food 
 4. Bad indigestion 
 5. Lying awake at night 
 6. Disagreeable dreams or nightmares 
 7. Constant aches or pains 
 8. Night sweats 
 9. A dislike for certain kinds of food 
10. Spells of dizziness 
11. Thoughts of death 
12. Being nervous or shy 
13. Sudden heart tremors or palpitations 
14. A tired feeling after waking 
15. Frequent headaches 
16. Entertaining folk 
17. Thoughts of suicide 
18. Being found fault with 
19. Not taking part in sports or games 
20. Sleep-walking 
21. Being too quarrelsome 
22. Loneliness 
23. Bad asthma or shortness of breath 
24. Feeling bored or “fed up” with life 
25. Sexual problems 
26. Mind wandering or day dreams 
27. Constant bad luck 
28. Being teased or made a fool of 
29. Not being understood or appreciated by people 
30. Meeting members of the opposite sex 
31. Being treated unfairly by others                                       5 
32. Leaving tasks unfinished                                              12 
33. Getting no pleasure out of life                                        2 
34. Talking too much                                                       7 
35. Lack of success in your studies                                        6 
36. Not being loved at home 
37. Addressing groups or meetings 
38. Being closely watched or observed 
39. People’s wickedness 
40. Being a failure in your job 
41. Feeling self-conscious 
42. Brooding over your sins 
43. Having wrong thoughts 
44. Going into tunnels or subways 
45. Not being your real self 
46. Some speech difficulty such as stammering or lisping 
47. Looking down from a height 
48. The need to do things over and over 
49. Things seeming unreal to you 
50. Talking in your sleep 
51. Not being able to converse easily with people 
52. Making one of a crowd 
53. Your lack of true friends                                       11 
54. Blushing                                                        37 
55. Lack of self-confidence                                         34 
56. Twitching of the face, head, body or limbs                       7 
57. Your religious beliefs                                          19 
58. Some vague constant fear or worry                               14 
59. A fear of going insane                                           7 
60. Inability to fix your attention on books, work or studies       28 
61. Having a bad temper                                             21 
62. Being afraid                                                    14 
63. Meeting with or talking to elders or social superiors           20 

* Directions I. First read carefully through every item mentioned below and 
underline everything which has ever troubled or worried you. 

** Directions II. Now go through each item underlined and if it is still worrying 
you, draw a ring around its number. 

[Most percentages are illegible in the source; the entries shown are those that survive.] 

Of all the sixty-three items only seven proved to be unproductive 
by showing a total yield of only 3% or less. These items are Nos. 3, 4, 
7, 8, 20, 27, and 44. The last indicated that claustrophobia is a fairly 
infrequent symptom compared with No. 47, which involves “looking 
down from a height.” For use with a normal and representative group 
of individuals the seven items could well be omitted. 

The remaining fifty-six questions were arbitrarily distributed by the 
writer under the headings shown in Table 2. They are arranged in 
descending order of frequency of occurrence. 


Table 2 
Grouping of Items in the Worry Inventory by Categories 

                                     Average Percentage 
                          No. of     Ever       Still 
Type of Difficulty        Items      Worried    Worried    Item Numbers 

Sex                          3         31          9       25, 30, 48 

Inferiority                 21         28          5       2, 12, 16, 18, 22, 
                                                           28, 29, 31, 34, 35, 
                                                           36, 37, 38, 40, 41, 
                                                           46, 51, 53, 54, 55, 
                                                           63 

Physical Symptoms: 
  Conversion                10         16          8       1, 9, 10, 13, 14, 
                                                           15, 23, 44, 47, 56 

Schizoid Trends             16         18          5       5, 6, 11, 17, 24, 
                                                           26, 32, 33, 45, 48, 
                                                           49, 50, 52, 58, 59, 
                                                           60 

Religious Difficulties       3         12          5       39, 42, 57 

Fear and Anger or 
  Cyclothymic Trends         3         14          3       21, 61, 62 

Total                       56         21          5 





Psycho-analytic literature would probably technically label these as 
erotic, narcissistic, conversion or hypochondriac, repression, super-ego 
and accumulations of id symptoms and trends. On the other hand, the 
nomenclature of various factor analysts * would use other specific terms 
such as “lack of confidence,” “solitariness” for the inferiority and schizoid 
trends. The remainder would hardly find a place. The chief concern 
of the investigation was to indicate certain practical advantages of the 
method of presentation and the method of marking the items. 

The results presented exhibit a decidedly different aspect from those 
obtained from the general run of such inventories, in the matter of 
items, the item groups and the extent of the prevalence of such difficulties. 
This inventory, therefore, tends to shed a different light on the 
nature of nervous worries and problems from that of the ordinary personality 
questionnaire. It cannot, of course, be indicated to what extent 
the disclosure of such worries was due to the anonymity covering the 
individuals who answered the inventory, but at least one-third of the 
group voluntarily subscribed their names. Further, the writer was 
brought into contact with the groups concerned as lecturer, so that a 
certain degree of rapport possibly existed. 

* Gibb, C. A., op. cit. 

Do these average frequencies really indicate the relative importance 
of the groups of items? One cannot answer “yes” directly to this question. 
In the first place the number of questions relating to “Inferiority” 
amounted to 21 or almost 40 per cent of the total of the items, yet the 
group ranks second in average importance to that of sex difficulties which 
is represented by only three items. 
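
The dependence of these averages on the number of items in each group can be made concrete by recomputing the overall figure from the category rows of Table 2. The sketch below weights each category mean by its item count. Because the published category means are already rounded, the recomputed totals only approximate the printed "Total" row (21 and 5). The variable and function names are illustrative.

```python
# (category, number of items, average % ever worried, average % still worried)
# Values as printed in Table 2 of the Worry Inventory paper.
CATEGORIES = [
    ("Sex", 3, 31, 9),
    ("Inferiority", 21, 28, 5),
    ("Conversion", 10, 16, 8),
    ("Schizoid Trends", 16, 18, 5),
    ("Religious Difficulties", 3, 12, 5),
    ("Fear and Anger", 3, 14, 3),
]


def item_weighted_mean(rows, column):
    """Mean of a percentage column, weighting each category by its item count."""
    total_items = sum(row[1] for row in rows)
    return sum(row[1] * row[column] for row in rows) / total_items


def unweighted_mean(rows, column):
    """Simple mean of the category averages, ignoring category size."""
    return sum(row[column] for row in rows) / len(rows)


n_items = sum(row[1] for row in CATEGORIES)
print("items:", n_items)
print("ever worried, item-weighted:", round(item_weighted_mean(CATEGORIES, 2), 1))
print("ever worried, unweighted:   ", round(unweighted_mean(CATEGORIES, 2), 1))
print("still worried, item-weighted:", round(item_weighted_mean(CATEGORIES, 3), 1))
```

The 21 inferiority items dominate the item-weighted figure, while the three sex items count for as much as the inferiority group in the unweighted one, which is the reason the category averages cannot be read directly as relative importance.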

But seven items of the inferiority group approach or show a greater 
percentage of yields than the most heavily weighted item, No. 30 of the 
Sex Difficulties group. There is, hence, no conclusive evidence here 
supporting either the exclusively Freudian or the purely Adlerian theories 
of personality. The individual reader must interpret the results in his 
own way. It can be stated, however, that the present inventory is 
possibly more “revealing” in many ways than the Thurstone Ques- 
tionnaire. 

Later on, both the Worry Inventory and the Thurstone Inventory 
were administered to a group of 36 students of both sexes, for the sake 
of comparison. The following correlations were obtained: 


Thurstone Inventory and Worry Inventory Underlining 
Thurstone and Worry Inventory Ringing 
Worry Inventory Underlining and Worry Inventory Ringing 


To supplement this a test of extraversion was administered to the 
same group within a week of giving the others. The results were: 


Extraversion and Thurstone 
Extraversion and Worry Inventory Underlining 
Extraversion and Worry Inventory Ringing 


Thus, there appears to be a remarkably consistent constant factor 
emerging from all these tests as revealed by these correlation results 
which is probably that of intro-extraversion. While its symptoms are 
legion, one underlying cause, that of ego-centricity, appears as the prime 
cause of the human difficulties. 


Summary 


1. In the present “Worry Inventory” a list of sixty-three items was 
presented to a group of 100 university students of both sexes ranging 
in age from 17 to 45 years. The inventory is intended to be representative 
rather than exhaustive. 

2. The method of administration of the Inventory proved simple and 
far less difficult to answer than the usual list of personality questions. 

3. Its use tended, possibly under a cloak of anonymity, towards a 
higher yield of nervous difficulties and a wider field of symptoms than 
is usually uncovered by the questionnaire method. However, some 
fair proportion of the excellent results is probably due to the direct 
approach used in the present Inventory. 

4. The Worry Inventory correlated to a very high degree with the 
Thurstone Questionnaire. 


Received November 29, 1943. 








Social Factors Annoying to Children 


Rose Zeligs 
Avondale Public School, Cincinnati, Ohio 


The present world upheaval adds to the bombardment of disturb- 
ances and annoyances that affect children and contribute to instability 
and lack of poise. A knowledge of annoying situations in children’s daily 
experiences should enable parents and teachers to eliminate many of the 
disturbing factors and to help the children adjust to annoyances that are 
unavoidable. 

This paper will discuss children’s reactions to annoyances they often 
experience. The subjects comprised 145 sixth-grade boys and 140 girls 
from two suburban elementary public schools of Cincinnati. Ninety- 
nine per cent of the children were native born, 70 per cent Jewish, 27 
per cent were non-Jewish, and 3 per cent were colored. Their average 
chronological age was 11 years and 8 months, and their average mental 
age, according to the Otis Group Intelligence Scale, Advanced Examina- 
tion, Form B, was 14 years. The socio-economic background was a little 
below very high on the Sims Score Card. 

These 285 sixth-grade children were asked to list all things which 
annoy, irritate, and bother them. The items were classified under social 
relationships, health and appearance, home and family, school, hobbies 
and interests, foods, personal conduct, games and amusements, fears, 
inconveniences and annoyances, animals and insects, and environmental 
conditions. 

The following year the material was presented to the 285 sixth-grade 
children of the same schools, described above. The material, in the form 
of separate tests for boys and girls, provided for five different degrees of 
feeling toward every item listed. These were like, don’t mind, don’t like, 
hate, and hate much. The children were asked to encircle the expression 
that best described their feeling toward each item listed. The material 
was tabulated, changed to per cents, and arranged according to frequency 
for hate much. 
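
The tabulation step described above (counting the encircled expressions, changing the counts to per cents, and ranking by "hate much") can be sketched as follows. The function names and the four-child sample data are hypothetical. The last lines apply the paper's grouping of "don't like," "hate," and "hate much" as annoyance to the average distribution reported under Table 1 for boys.

```python
from collections import Counter

SCALE = ("like", "don't mind", "don't like", "hate", "hate much")
ANNOYED = ("don't like", "hate", "hate much")


def tabulate(responses):
    """Turn raw marks into per cents and rank items by 'hate much'.

    responses: dict mapping each item to the list of expressions the
    children encircled for it (one expression per child).
    """
    table = {}
    for item, marks in responses.items():
        if not marks:
            continue  # skip items nobody answered
        counts = Counter(marks)
        table[item] = {
            label: round(100 * counts[label] / len(marks)) for label in SCALE
        }
    return sorted(table.items(), key=lambda kv: kv[1]["hate much"], reverse=True)


def percent_annoyed(percents):
    """Per cent marking any of the three degrees counted as annoyance."""
    return sum(percents[label] for label in ANNOYED)


# Hypothetical marks from four children on two items.
sample = {
    "to fight": ["like", "hate", "don't mind", "hate much"],
    "a bully": ["hate much", "hate much", "hate", "don't like"],
}
for item, percents in tabulate(sample):
    print(item, percents)

# Average distribution reported for boys on social situations (Table 1).
boys_average = {"like": 16, "don't mind": 19, "don't like": 21,
                "hate": 17, "hate much": 27}
print("per cent annoyed:", percent_annoyed(boys_average))
```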

This paper is a report of the items classified under social relationships, 
home and family, school, and personal conduct. 


Attitudes of Boys 


Table 1 lists social situations that are annoying (don’t like, hate, and 
hate much) to 65 per cent of the boys studied. They are especially 
disturbed when blamed for something they did not do and by people who 
cheat, unfair things, a bully, and war. They do not like to have people 
angry at them, to lose someone’s things, or to be poor. People who talk 
too much, who lie or argue, who are stupid, mean or sarcastic, who laugh 
when others get hurt are very much disliked by the boys. Sissies, show- 
offs, or tattletales rate low with these boys. They do not like to lose 
in a fight or to fight with little boys, to get into trouble, to be ridiculed, 
or to be slapped. The boys vary in the number of situations that annoy 
them and in the degree to which they are annoyed.¹ 


Table 1 
Attitudes of 145 Boys Toward Social Situations 

Item                                          Per Cent Who Marked “Hate Much” 

Be blamed for something I didn’t do           59 
People who cheat                              59 
Unfair things                                 59 
A bully                                       56 
War                                           56 
Have people angry at me                       51 
To lose someone’s things                      46 
To be poor                                    45 
People who show off                           44 
A sissy                                       44 
People who tell lies                          43 
People who laugh when I get hurt              43 
People who always argue                       40 
Stupid people                                 40 
To lose in a fight                            37 
To fight little boys                          37 
A mean person                                 37 
To get into trouble                           37 
Tattletales                                   35 
To be slapped                                 34 
To be ridiculed                               33 
Sarcastic remarks                             33 
Two against one                               28 
People who talk too much                      28 
Ugly people                                   26 
See girls act grown up                        26 
See people crying                             26 
Hear talk on self-control                     24 
To borrow money                               22 
Old maids                                     22 
People who talk too fast                      22 
People who talk too slowly                    21 
To fight                                      19 
Long religious sermons                        18 
To be alone                                   18 
People on diets                               14 
To talk on telephone                          13 
Certain boys                                  13 
Certain girls                                 13 
People who act funny                          12 

The average distribution for all items, in terms of per cent, was: Like 16, Don’t 
mind 19, Don’t like 21, Hate 17, and Hate much 27. 
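
The 65 per cent figure cited for Table 1 agrees with the sum of the three unfavorable categories in the average distribution reported with that table. The short Python sketch below shows the arithmetic; the variable and function names are illustrative only and are not part of the original study.

```python
# Average distribution of responses for the boys' social-situation items,
# in per cent, as given in the note to Table 1. Names are illustrative.
boys_social = {
    "like": 16,
    "don't mind": 19,
    "don't like": 21,
    "hate": 17,
    "hate much": 27,
}

# The text counts "don't like", "hate", and "hate much" as annoyance.
ANNOYED = ("don't like", "hate", "hate much")


def annoyed_share(distribution):
    """Return the summed per cent of responses in the three annoyance categories."""
    return sum(distribution[category] for category in ANNOYED)


print(annoyed_share(boys_social))
```

The same sum can be formed from the note under any of the other tables by substituting that table's five per cents.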

Table 2 gives home and family situations that are annoying to many 
boys. They dislike most to get a whipping, especially for something they 
did not do, to be scolded, punished, or nagged. 

Some of the boys do not like to do various chores around the house, 
run errands, or take care of smaller siblings. 

1 Rose Zeligs, The relationship of emotional and personality traits to learning in 
children (unpublished Doctor’s dissertation, University of Cincinnati, 1937). 
Table 2 
Attitudes of 145 Boys Toward Home and Family 

Item                                          Per Cent Who Marked “Hate Much” 

To get a whipping for something I didn’t do   62 
To get a whipping                             50 
To be scolded                                 42 
Let brother break my games                    36 
To be punished                                36 
To be nagged                                  35 
To clean the cellar                           23 
To sew                                        21 
To wait in store while mother shops           20 
My sister’s singing                           18 
To go to bed early                            16 
To scrub                                      16 
To clean the house                            14 
To take care of cousin                        13 
To wash and dry dishes                        13 
To sweep floors                               13 
To wash the bathtub                           13 
To work around the house                      13 
To go to bed late                             12 
To dust furniture                             12 
To set the table                              12 
To fight with brother                         10 
To clean gold-fish bowl                       10 
To cook                                       10 
To go to the store                            10 
To take care of house                          9 
To shovel snow                                 9 
To go to bed                                   9 
To rake the yard                               8 
To run errands                                 6 

The average distribution for all items, in terms of per cent, was: Like 23, Don’t 
mind 33, Don’t like 16, Hate 11, and Hate much 17. 














Table 3 
Attitudes of 145 Boys Toward School Situations 

Item                                          Per Cent Who Marked “Hate Much” 

To get low marks                              49 
Certain teachers                              40 
Teachers who have pets                        36 
Be sent to principal                          36 
To hear teachers preach                       30 
Teachers who dislike me                       30 
Writing lessons                               23 
To memorize poems                             21 
High singing                                  18 
To write compositions                         17 
Singing in school                             14 
Arithmetic                                    13 
To go to Sunday School                        13 
To do nightwork                               13 
To have to miss school                        12 
To write book reports                         11 
To study grammar                              11 
To go to school                                9 
To give talks                                  9 
To study history                               8 
To read my compositions                        7 
To take art                                    7 
To take tests                                  6 
To study geography                             4 

The average distribution for all items, in terms of per cent, was: Like 34, Don’t 
mind 27, Don’t like 14, Hate 9, and Hate much 16. 

Though many school contacts are enjoyed by the boys, it can be seen 
from Table 3 that the boys dislike most to get low marks, certain teachers, 
teachers who have pets, who preach, or who do not like them, and to be 
sent to the principal. Almost 20 per cent of the boys hate or hate much 
to go to school while 40 per cent of them have that same reaction towards 
Sunday School. Writing and singing lessons represent the least-liked 
subjects, while manual training and gymnasium work are the favorite 
subjects. 

Table 4 lists the boys’ attitudes toward personal conduct. Heading 
the list are character and personality qualities. To curse, tell lies, or be 
accused of lying and to have a guilty conscience or bad habits are greatly 
disliked. Other annoying situations involve the loss of money, fountain 
pen, or other possessions, or losing in games. 


Table 4 
Attitudes of 145 Boys Toward Personal Conduct 

Item                                          Per Cent Who Marked “Hate Much” 

To curse                                      57 
To tell lies                                  52 
To be stupid                                  51 
To be accused of lying                        51 
A guilty conscience                           47 
To lose money                                 46 
To cry                                        45 
To lose a fountain pen                        45 
To bite my nails                              41 
To lose anything                              41 
To have bad habits                            38 
To break windows                              38 
To brag                                       38 
To get angry                                  32 
To lose in a ball game                        30 
To lose in any game                           28 
To hurt someone                               28 
To lose in card games                         26 
To lose in marbles                            25 
To attract attention                          18 
To kiss people                                15 
To be still                                   12 

The average distribution for all items, in terms of per cent, was: Like 4, Don’t 
mind 13, Don’t like 25, Hate 22, and Hate much 36. 


Attitudes of Girls 


Girls express annoyance to a greater number of situations and to a 
greater degree than do the boys. Table 5 contains many more items 
showing girls’ dislikes in social situations than are found in Table 1 for 
boys. Most disturbing to the girls is to witness pain and suffering, people 
getting killed, sickness and death. Social relationships that result in 
lowering their dignity or personal status are very annoying to the girls. 
These include being accused of something they did not do, or of lying, 
being called a cheater, made fun of, treated like a baby, called names, 
scolded, or called a poor sport. 

Table 5 
Attitudes of 140 Girls Toward Social Situations 

Item                                          Per Cent Who Marked “Hate Much” 

See people getting killed                     68 
Be accused of something I didn’t do           68 
Be called a cheater                           61 
See people suffer                             56 
Be treated like a baby                        56 
Be made fun of                                55 
Hear someone has died                         53 
Be called names                               51 
Be scolded                                    50 
Be bothered or pestered                       50 
A bully                                       48 
People who are fakers                         48 
To see someone sick                           48 
To be called a poor sport                     48 
To be accused of lying                        48 
Associate with dirty people                   47 
People who lie                                47 
Nasty people                                  47 
People minding my business                    46 
Jealous people                                46 
A dry speech                                  45 
To be embarrassed                             45 
To see a child whipped                        45 
Sissies                                       44 
Children who cheat                            43 
An untrue friend                              43 
To be insulted                                43 
To play with mean children                    43 
Constant nagging                              43 
See people biting their nails                 43 
Conceited people                              42 
Stubborn people                               40 
Have privileges taken away                    40 
People who are pests                          40 
People who brag                               40 
When friends get sick                         40 
See a crippled person                         39 
See children show off                         39 
Sloppy people                                 38 
A hairy actor                                 38 
Stupid people                                 37 
To be yelled at                               36 
A stale joke                                  36 
Grouchy children                              36 
Selfish people                                36 
Be bothered when reading                      35 
Quarrel with boy friend                       35 
Children acting babyish                       33 
When friends act foolish                      33 
People who talk too much                      33 
People who don’t like to do anything          33 
A dull party                                  32 
People who gossip                             32 
To lose in a game                             32 
Thoughtless people                            32 
Play with someone I don’t like                31 
Not to go to parties                          31 
People who talk in movies                     31 
To be teased                                  31 
Visits to boring people                       31 
Be teased about boy friend                    28 
Discourteous children                         27 
Silly people                                  27 
To fight                                      26 
To get angry at someone                       25 
To be patronized                              24 
Smart person acting dumb                      23 
People who are cross                          23 
Not to go to dancing school                   23 
People who are rough                          23 
Persisting people                             22 
To be disappointed                            22 
Kissing games at parties                      21 
To pay full car fare                          22 
To visit sick adults                          20 
Talk about self control                       20 
Boys who make love to me                      19 
Talk long on one subject                      19 
To take revenge                               18 
People who act funny                          17 
Have big responsibilities                     16 
To talk long on phone                         15 

The average distribution for all items, in terms of per cent, was: Like 14, Don’t 
mind 14, Don’t like 20, Hate 22, Hate much 30. 

Most of the girls express extreme dislike for antisocial people such 
as those who lie, mind other people’s business, are jealous and nasty. 
They don’t like sissies, children who cheat, and an untrue friend. Per- 
sonality traits annoying to the girls are conceit, stubbornness, sloppiness, 
grouchiness, foolishness, selfishness, stupidity, and thoughtlessness. 
Constant nagging and teasing, visits from boring people, and individuals 
who bite their finger nails, talk in movies, and gossip are disliked. In 
general, the girls reflect very definite standards of social conduct and an 
unfavorable attitude toward those who do not live up to those standards. 

Also, in their attitudes toward home and family the girls reflect 
standards of family relationships. As seen in Table 6, practically all the 
girls dislike to make their parents unhappy or angry, to be punished or 
scolded. Though they do not like to be pestered by siblings, most of the 
girls like their little sisters and brothers and do not mind taking care of 
them. Practically all the girls like babies. Some of them dislike house- 
work, but very few girls dislike company or going out on Sunday. 


Table 6 
Attitudes of 140 Girls Toward Home and Family 

Item                                          Per Cent Who Marked “Hate Much” 

Make parents unhappy                          54 
A spanking                                    52 
To irritate my mother                         50 
When mother is angry                          41 
To be punished                                41 
To be scolded                                 37 
To stay home on Sunday                        36 
A dirty yard                                  36 
To see little brother cry                     31 
To have my things bothered                    28 
Not to help mother                            24 
When mother won’t buy things I want           23 
To be bothered by brother                     22 
Not to see relatives                          21 
To wash and iron clothes                      20 
To be bothered by sister                      20 
To scrub                                      16 
To wash stockings                             12 
To wash dishes                                12 
To dust furniture                             11 
To make beds                                  11 
My sister’s singing                           10 
To sweep                                       9 
To darn socks                                  9 
To go to bed late                              9 
To clean the house                             7 

The average distribution for all items, in terms of per cent, was: Like 26, Don’t 
mind 24, Don’t like 18, Hate 14, and Hate much 18. 

As shown in Table 7, maladjustment in school is displeasing to the 
girls. They are extremely disturbed by bad report marks, unsatisfactory 
lessons, or poor conduct in school, especially if these require their mothers 
to come to school. They do not like teachers who get angry, scold, or 
give long lectures, or fail to mark papers. Writing lessons are liked least 
of all the subjects, while gymnasium work, spelling, drawing, and geog- 
raphy are liked most. 


Table 7 
Attitudes of 140 Girls Toward School Situations 

Item                                          Per Cent Who Marked “Hate Much” 

Bad report marks                              58 
When mother has to come to school             40 
Not to know my lesson                         39 
To be bad in school                           36 
Not to have my homework                       34 
To make teacher angry                         33 
When I can’t answer a question                33 
To get nervous in a test                      31 
Too much nightwork                            30 
To do hard lessons                            27 
Long lectures by teachers                     25 
To talk out in school                         24 
Teachers who scold                            23 
To memorize lessons                           22 
To stay home from school                      22 
Writing lessons                               21 
To have nightwork to do                       21 
Teachers who don’t mark papers                19 
To discuss poems                              18 
To copy things over                           18 
To memorize poems                             18 
To make book reports                          17 
To go to Sunday School                        13 
To carry books                                13 
To study                                      13 
To have tests                                 11 
Certain teachers                              11 
To have school over                           10 
To write compositions                          9 
Grammar                                        6 

The average distribution for all items, in terms of per cent, was: Like 26, Don’t 
mind 23, Don’t like 17, Hate 16, and Hate much 19. 

Antisocial character and personality traits (Table 8), as expressed in 
their personal conduct, are annoying to the girls. Telling lies, being 
bad, mean, or unreliable, doing wrong, or getting into trouble are marked 
hate or hate much by most of the girls. Also having bad habits, biting 
finger nails, or being lazy and forgetful receive the disapproval of many 
girls. 


Table 8 
Attitudes of 140 Girls Toward Personal Conduct 

Item                                          Per Cent Who Marked “Hate Much” 

To have bad habits                            58 
To tell lies                                  55 
To bite fingernails                           53 
To be bad                                     51 
To be mean                                    51 
To be unreliable                              51 
To get in trouble                             42 
To be idle or lazy                            37 
To do something I don’t like                  36 
To make plans and drop them                   34 
When I do wrong                               31 
To forget things                              28 
To sit in perfect silence                     23 
A self-conscious feeling                      21 
When I can’t do as I like                     19 
Always to want things                         16 
To sit still                                  14 

The average distribution for all items, in terms of per cent, was: Like 11, Don’t 
mind 12, Don’t like 21, Hate 23, and Hate much 33. 


Sex Differences 

A comparison of the sexes in relation to annoyances on 102 items 
listed by both groups indicates that girls are annoyed by more factors 
than boys and also that they react to a greater degree than the boys do 
to those annoyances. This is especially true in relation to personal 
conduct. Significant sex differences show greater dislike by boys of 
people who cheat. They are much more resentful than the girls of not 
being able to do as they like. In school, significantly more boys dislike 
arithmetic and singing, while more girls do not like to have nightwork 
to do. More boys do not like to sew or to hear their sisters sing. Sig- 
nificantly more girls hate to bite their finger nails. 


Summary 


A study of children’s reactions to annoyances occurring in social 
relationships, home and family, school, and personal conduct indicates 
that many situations are extremely annoying to twelve-year-old children. 
Girls are more often and more extremely annoyed than boys. 

Educational Implications. An annoyance is an unfavorable attitude 
which in many cases is caught from others. Some of the items listed 
should not be annoying to the normal child. People who are constantly 
irritated by every little thing reflect a failure to adjust to their environment. 
“They can’t take it.” On the other hand, a glance at many of 
the items listed in the tables suggests definite social standards which 
have been accepted by the children. Being annoyed by antisocial con- 
duct reveals the acceptance of favorable group attitudes and leads to 
adaptation to the social group. But when children respond unfavorably 
to many unimportant situations, they should be taught to ignore those 
factors and not be disturbed by them. However, we should avoid as 
much as possible constant and unnecessary irritation. This is vital in a 
time when the human organism is being worn out by many irritations, 
fears, and worries. 


Received January 20, 1944. 






































A School Survey of Eye-Hand Dominance 


Gertrude Hildreth 
Horace Mann-Lincoln School, Columbia University 





The belief is widespread that mixed eye-hand dominance, that is, the tendency for 
eye and hand preference to be opposite sided, is a cause of reading disability, nervous- 
ness and behavior problems. The right-handed child who is left-eyed or vice versa is 
assumed to have more difficulty in learning to read and in making satisfactory personal 
adjustments. He is believed to mirror write more than the individual whose eye and 
hand dominance are like-sided. A parent recently inquired, “Isn’t my child’s reading 
difficulty due to the fact that tests show she is left-eyed?” The notion is also prevalent 
that mixed eye-hand dominance is a rare occurrence in the general population. 

A number of published studies have given somewhat conflicting results. The present 
study was undertaken to obtain additional evidence on the subject, to determine the 
incidence of mixed dominance in an entire elementary school population and the asso- 
ciation between mixed dominance and reading disability in the same population. 


Results of Previous Studies 


Incidence of Left-Eyedness and Mixed Dominance. Parson (8) reported 
30 per cent left-eyedness, 12 per cent impartial, and 58 per cent 
right-eyedness in a normal school population in which left-handedness 
was present in about four per cent of the cases. Selzer (10) found 32 
per cent of 96 children in the Cambridge schools, Grades II to VI, to be 
left-eyed; 72 per cent of the left-handed children were left-eyed. Schonell 
(9) investigated the relation between eye and hand dominance in 75 
school children, with the following results (R.H. and R.E. signify right 
hand and right eye respectively): R.H.—R.E., 60%; R.H.—L.E., 25%; 
L.H.—L.E., 4%; L.H.—R.E., 3%; R.H.—and either eye, 8%; and 
L.H.—and either eye, 0. 

Mixed Dominance and Reading Disability. Johnson O’Connor (7) 
reported an unusual amount of uncoordinated motion, twitching, shifting 
of position, speech irregularities and reading troubles among those who 
tested left-eyed and right-handed. He found less trouble among those 
whose handedness and eyedness agreed. A study by Mintz (5) indicated 
more mixed dominance among subnormal than among normal children. 
Monroe (6) reported that 43 per cent of the reading disability cases she 
studied showed mixed dominance. Fernald (1) reports on the problem 
as follows: “The right-handed cases and the cases of matched eye-hand 
dominance resemble the cases in which the dominance is not matched, 
are as serious in their deficiency, learn by the same methods, and are as 
successful in the final outcome. . . . The subject with unmatched eye 
and hand dominance learns to read and is able to read in an entirely 
normal manner with eye and hand dominance still opposite.” Gates 
and Bond (2) found little association between eyedness, handedness and 
reading achievement. Johnson (4) reported no significant relationship 
between lateral dominance and reading disability. Wile (11) found 
among children showing reading disability, 62 per cent left-eyed, 8 per 
cent uncertain or mixed, 30 per cent right dominant. There is no indi- 
cation as to how many of these pupils were left-handed. Witty and 
Kopel (12) found 33 per cent left-eyed in a group of poor readers, 31 
per cent left-eyed in the normal reading, non-problem group. Wolfe (13) 
reported that more normal readers than retarded showed dominant left- 
eyedness on a sighting test and the groups were found not to be dis- 
tinguished by handedness or a combination of eye-hand dominance. 
However, the retarded proved to be inferior in auditory functions, visual 
perception and emotional adjustment. 


A School Survey 


New evidence on the question was obtained through surveying an 
elementary school population consisting of 101 boys and 90 girls from 
kindergarten through Grade VI. The age range was from 6 to 11 with 
two children who lacked a month of being 6 and two who had just turned 
12. The population rated above the average in general ability as meas- 
ured by standard tests. Because of the selected nature of the population 
there were no cases in which reading disability could be attributed to 
lack of learning capacity. 

Tests of Eye and Hand Dominance. Each pupil was given individual 
tests of eye and hand dominance. Four tests of eye dominance were 
used as follows: 


1. Sighting dot. The child’s attention was called to a black dot on a card 
tacked on the wall. He was told to stand with toes on a line marked on the 
floor three feet from the wall. The dot was placed at eye level. The child 
was given a box lid in the center of which was a half inch hole. He was asked 
to hold the hole at arm’s length with both hands. He was told that one eye 
would be covered, then the other, and that the dot would probably disappear 
when one eye was covered. The examiner covered each eye in turn with a 
small card, asking the child to state whether he could or could not still see the 
dot. The eye with which he could still see the dot when one eye was covered 
was recorded as the dominant eye. 


2. Sighting with the Parson manoptoscope. The child was asked to stand 
on a line two feet from the wall where the Parson board with red ball in be- 
tween the letters L and R was hung up at the right height for his size. He 
was given the aluminum cone, was instructed to hold it in both hands, his 
attention was called to the red ball and he was instructed to fixate the red ball 
as he looked through the cone. Then the examiner explained that one letter 
could be seen alongside the ball, and the child was asked to state what letter 
he could see. This letter was recorded. For the youngest children, small 
pictures of a dog and cat were used in place of the letters L and R. 

3. Sighting dot with the Parson cone. The child standing two feet from a 
card with a black dot held by the examiner, was asked to fixate the dot through 
the aluminum cone. The position of the aperture of the cone with reference 
to an imaginary vertical line drawn through the child’s nose was recorded. 
If to the right of the imaginary line, the child was recorded as right-eyed; if 
to the left, left-eyed. 

4. Peep Show. The child was shown an imitation Easter egg with pictures 
inside that could be seen by looking through a quarter inch aperture at the 
smaller end of the egg. The examiner held the egg in both hands while the 
child studied and reported on the picture. The eye used in fixating the pic- 
ture was recorded as the dominant eye. 

A test was repeated if for any reason results were not clear cut. In the 
first three, the child kept both eyes open. 

Only the fourth test in which the examiner rather than the child held the 
materials could be said to be entirely independent of handedness influence. 

Determination of dominant eyedness was more difficult and probably less 
reliable in the case of younger children because of their difficulty in complying 
with instructions and their concern with irrelevant features of the test situation. 


Tests of Handedness. Handedness was tested in three ways: 


1. The child was requested to pick up a pair of scissors and to cut a corner 
from a piece of paper. 
2. He was asked to write his name on the paper. 
3. He was asked to pick up and toss a ball to the examiner several times. 
These three tests afford a sampling of single handedness activities rather 
than a complete survey of each pupil’s handedness habits. 


Table 1 
Results of Eye and Hand Dominance Tests in an Elementary School Population 

                         Eyedness                        Handedness 
Age Level      4R     4L     3R     3L    2R-2L      3R     3L     2R     2L     Cases 

5-6 Years 
  Number       10      8      2      2      2        20      2      1      1      24 
  Per Cent     42     33     8.3    8.3    8.3      83.3    8.3    4.2    4.2 
7 Years 
  Number       13      6      3      2      2        22      1      1      2      26 
  Per Cent     50     23    11.5    7.7    7.7      85      3.8    3.8    7.7 
8 Years 
  Number       19     10      1      2      -        26      3      2      1      32 
  Per Cent     59     31     3.1    6.2     -       81      9.4    6.2    3.1 
9 Years 
  Number       14      9      -      1      3        26      1      -      -      27 
  Per Cent     52     33      -     3.7   11.1      96.3    3.7     -      - 
10 Years 
  Number       27      8      4      2      1        34      4      -      4      42 
  Per Cent     64     19     9.5    4.8    2.4      81      9.5     -     9.5 
11 Years 
  Number       12     15      3      5      5        36      2      1      1      40 
  Per Cent     30    37.5    7.5   12.5   12.5      90      5.0    2.5    2.5 

Indications of Reading Achievement. Objective test records, for the most 
part Stanford Achievement reading scores, or comparable data, were available 
for each pupil. Those who scored a year or more below grade median in the 
primary grades, a year and a half or more in the upper grades were considered 
disability cases for purposes of this study. Teachers’ estimates and the neces- 
sity for recent remedial instruction in reading served as additional criteria in 
selecting the poorest readers in each class. 


Results 


Table 1 shows the number and percentage of pupils in each age level 
who were totally or partially right- or left-eyed, totally or partially 
right- or left-handed on the seven tests given. The right-eyed children 
were more consistently right-eyed in the four eyedness tests than were 
the left-eyed children. It was rather surprising to find a substantial 
number of the right-eyed children writing with the left hand. 


Table 2 
Results of Eye and Hand Dominance Tests showing the Relationship between 
Eye and Hand Dominance 

                     Mixed Eye-Hand Dominance 
Age Level     R.H.-L.E.   L.H.-R.E.   Total Number   Total Per Cent    R.H.-R.E.   L.H.-L.E. 
                                      Mixed          Mixed 

5-6 Years         7           1            8             33.3              11          2 
7 Years           8           3           11             42                14          - 
8 Years           8           1            9             28                20          2 
9 Years           9           -            9             33.3              14          1 
10 Years          7           3           10             24                27          4 
11 Years         20           3           23             57.5              12          - 

Per cent mixed, all age levels combined: 30.6 

The record for the only pair of twins: 

          Eyedness Tests      Handedness Tests 
          1   2   3   4       1   2   3 
Girl      R   R   R   R       R   R   R 
Boy       L   L   L   L       R   R   R 


There was no clear cut or consistent developmental tendency from 
one age group to the next. See Table 2. Although numerous research 
studies have shown a decrease in dominant left-handedness with age, 
and a slight tendency toward decreasing left-handedness was found in 
these results, no such decrease was shown in these data for eye-dominance. 





Table 3 
Eye-Hand Dominance Data for the Slowest Readers in Each Age Group 

                        Boys                                Girls 
Age Level    Case    Eyedness   Handedness     Case    Eyedness   Handedness 

7 years      1       4R         3R             -       -          - 
             2       4L         3R 
8 years      1       4R         3L             1       4L         3R (foreign child) 
             2       4R         3R             2       4R         3R 
                                               3       4L         3R 
9 years      1       4L         3R             1       4R         3R 
             2       4R         3R 
             3       3L         3R 
10 years     1       4L         3L             1       4R         3R 
             2       4L         3R 
             3       4R         3R 
             4       4R         3R 
             5       3L-1R      3R 
11 years     1       3R-1L      3R             1       2R-2L      3R 
             2       4R         3R 
             3       4R         3R 
             4       3R-1L      3L 




In fact, the eleven-year group showed an increase in left-eyedness over 
the ten-year group and more instability on the tests in general. The 
same situation did not prevail in the handedness test results for the 
eleven-year group. This may be a chance result due to the small popu- 
lation or it may be an indication of instability appearing with the ap- 
proach of puberty. 
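
The per cents of mixed dominance by age level can be recomputed from the counts in Tables 1 and 2. The sketch below is a minimal Python illustration, assuming each per cent is the number of mixed cases divided by the number of cases at that age level; the dictionary and function names are hypothetical and serve only to show the arithmetic.

```python
# (number of mixed eye-hand dominance cases, number of cases) by age level.
# Mixed counts come from Table 2; case counts from the "Cases" column of Table 1.
mixed_by_age = {
    "5-6": (8, 24),
    "7": (11, 26),
    "8": (9, 32),
    "9": (9, 27),
    "10": (10, 42),
    "11": (23, 40),
}


def per_cent_mixed(mixed, cases):
    """Per cent of an age group showing mixed dominance, to one decimal place."""
    return round(100 * mixed / cases, 1)


for age, (mixed, cases) in mixed_by_age.items():
    print(f"{age} years: {per_cent_mixed(mixed, cases)} per cent mixed")
```

On these counts the eleven-year group stands at 57.5 per cent, against 23.8 per cent for the ten-year group, which is the contrast the text describes.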

Eye-Hand Dominance and Reading Disability. Twenty-two cases, 
sixteen boys and six girls in the entire group of 191, were selected as 
reading disability cases, according to the criteria given above. These 
pupils were all seven years of age or older. Table 3 shows the results 
for each age level in terms of eye-hand dominance. 
Of the boys, there were seven or 44 per cent of the cases who showed 
mixed eye-hand dominance; of the girls, three or 50 per cent who showed 
this condition, a total of 45 per cent or less than half of all reading dis- 
ability cases. The number of cases is too small to indicate reliably that 
these reading disability cases showed more tendency toward mixed domi- 
nance than the normally successful readers. 
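
The per cents quoted for the reading disability cases follow directly from the counts given in the text (sixteen boys, six girls). A brief Python check is sketched below, assuming whole-number rounding as in the text; the names used are illustrative only.

```python
# Reading disability cases: (cases with mixed dominance, all cases).
boys = (7, 16)
girls = (3, 6)


def per_cent(part, whole):
    """Whole-number per cent, matching the rounding used in the text."""
    return round(100 * part / whole)


# Combine boys and girls to obtain the figure for all disability cases.
total = (boys[0] + girls[0], boys[1] + girls[1])

print(per_cent(*boys), per_cent(*girls), per_cent(*total))
```

The combined figure rests on only 22 cases, which is why the text cautions against treating the difference as reliable.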

Although a somewhat larger proportion of mixed dominant cases had 
difficulty with reading than consistent dominant pupils, since fewer than 
half the mixed dominants were slow in learning to read, the conclusion 
must be drawn that mixed dominance is not a prevailing condition in 
reading disability, far less a dominant causal factor in the majority of 
disability cases. 


Received December 10, 1948. 
Hahn, Eugene F. Stuttering, significant theories and therapies. Palo 
Alto: Stanford University Press, 1943. Pp. ix +177. $2.00. 


Professor Hahn has presented a compendium of theories and therapies 
of 25 of the “authorities in the field.” He does not state the basis for 
his selection of specialists, although the list appears to be representative 
of the major educational and medical clinics doing research and thera- 
peutic work on stuttering in this country and abroad, and to suggest the 
great variety of ways in which the problem is studied and treated. Since 
the several reports average less than six pages each, they are necessarily 
sketchy and incomplete. The author’s purpose of facilitating compari- 
sons of the philosophies and therapeutic procedures involved is inherently 
limited by this restriction. The careful student will want to go to origi- 
nal sources. The beginner may be more confused than enlightened by 
the multiplicity of points of view presented. 

Each summary is divided into sections on Theory and Therapy. 
While most specialists appear to have been impelled to evolve a unique 
theory of stuttering, the careful reader will detect a considerable amount 
of similarity under the verbiage used to explain many of the theories. 
Some are certainly much more closely in line with modern psychological 
thinking than others. One of the values of the theory is no doubt the 
suggestion to the patient that the clinician is an expert. Anyone who 
has such a plausible explanation of the difficulty would surely appeal to the 
afflicted as a desirable person to administer treatment or direct his study. 
There is no indication that clinicians have consciously evolved their 
theories for this purpose. The multiplicity of theories is suggestive of 
the early reports to The French Academy of Sciences on the origin of 
language. 

The Sections on Therapy appear to this reviewer as more significant. 
While few clinicians claim universal “cures,” the success reported with 
such widely divergent treatments indicates either that stuttering is a 
malady of many causes and therefore subject to many treatments, or 
that there are common therapeutic values often overlooked in the various 
types of therapy, and not explained as such in the theory. 

An Appendix of fifteen pages prepared by the author and entitled 
“Procedures in a Clinic for Stutterers” is a gesture toward unification of 
the various points-of-view of the “specialists.” Within the limits of 
space used, this appears to be one of the better features of the book. 
The book focuses attention on the great variety of backgrounds and 
approaches to the malady by those who are “authorities in the field,” 
and leaves the reader with the realization that there is still much research 
to be done. If this book serves as a stimulus to further research and 
unification of thinking in the field, it will make an important contribution. 
This reviewer does not believe that the book will serve as such a stimulus. 


Franklin H. Knower 
The State University of Iowa, Department of Speech 


Cole, Luella. Attaining maturity. New York: Farrar and Rinehart, 
1944. Pp. x + 212. $2.00. 


This treatise is concerned with problems of adjustment in the modern 
world. Both in ordinary times and in times of stress there are always 
individuals who cannot adjust to complexities of life. Some of these 
never progress beyond childish attitudes and adolescent whims. Others 
regress to this status in trying times. In other words, such people never 
grow up, i.e., attain maturity. The mature individual is relatively free, 
he finds contentment and he feels secure. After noting the need for 
maturity, the author discusses the criteria of intellectual, emotional, 
social, and moral maturity. Popular escape is by fantasy, play, solitude, 
fanaticism, projection, sophistication and illness. Solutions for the ma- 
ture person are briefly outlined. The final section is concerned with 
maturity and the war. The book contains many enlightening examples. 
Both the case studies and the discussion are well organized to aid in an 
analysis of acute problems of adjustment. As a guidebook for those 
seeking to attain maturity, however, the treatise is less adequate. There 
are a few ill-considered statements such as “the madhouse yawns for 
those who cannot be emotionally toughened” in war situations. Nevertheless 
the book has much to recommend it as “a guide to living with 
yourself and other people.” 


Miles A. Tinker 
The University of Minnesota 


Kingsley, Donald J. [Chairman]. Recruiting applicants for the public 
service. A report submitted to the Civil Service Assembly by the Com- 
mittee on Recruiting Applicants for the Public Service. Chicago: Civil 
Service Assembly of the U. S. and Canada, 1942. Pp. xvi + 200. 
$3.00. 


This volume sponsored by the Civil Service Assembly of the U. S. 
and Canada is another in the series dealing with the major phases of 
public personnel administration. The report is the work of J. Donald 
Kingsley aided by a committee consisting of Fred Zapolo, Russel Barthell, 
Irving Gold, George C. Brown, William Howell, George D. Halsey, and
Edgar B. Young, among others. 

The treatment of the subject matter can be subsumed under four 
categories or topics embracing six chapters: (1) historical antecedents of 
public service recruitment, (2) the nature of recruitment and recruitment 
procedures, (3) determining and forecasting personnel needs, and (4) 
application procedure and audit. 

The content of the report is summarized as follows: Traditional con- 
cepts of recruitment are inadequate and have generally failed to produce 
a meritorious public service. The assumption that “keeping the rascals
out”—a reaction to the spoils system—would of itself establish an adequate
public service has been shown to be incorrect, and in fostering a
laissez faire attitude in regard to recruitment, has resulted in a failure to 
attract men of superior ability to the public service. Recruitment should
be positive in approach and should be directed to the lower levels of integrated
classes of positions, with promotion to higher levels. Such recruitment
should occur at an early age and the point of entry into the service should 
be related to the educational system. The institution of these principles,
coupled with flexible transfer and promotion, would give rise to a
career system attractive to men of ability. Assumptions of such a career
system are that adequate techniques for attracting and selecting recruits 
are available, and that high prestige value exists for the public service. 

Recruitment is defined as “that process through which suitable candidates
are induced to compete for appointments to the public service.”
The first step in the process is the determination of present and future 
personnel needs as a basis for planning the recruitment schedule. Such 
forecasting reduces the need for provisional appointments and serves to 
coordinate the recruitment, selection, and certification programs. Per- 
sonnel needs may be determined from many sources but among the most 
useful are departmental demands for new personnel, personnel records 
and turnover statistics, analysis of departmental budgets, and surveys of 
employees for promotional possibilities. 

There are two kinds of recruitment: anticipatory and direct. The 
former consists of building up favorable attitudes toward the public 
service without regard to any particular examination but with an eye to 
remote recruitment; the latter is the specific search and location of an 
adequate number of applicants for a specific examination. 

Examination announcements are employed in direct recruitment to 
attract and interest qualified persons in examinations, to inform them 
of the nature and conditions of employment, and to discourage unqualified
persons from applying. However, examination announcements are not
equally suited for these functions and should only be employed for individuals
already interested in public employment. Where legal requirements
are not restrictive, examination publicity should be of a variegated
nature and in line with principles of modern advertising. Announce- 
ments should be made more attractive and more care given to their 
distribution. 

Admission to competition is invariably by application. This process 
involves the construction of an application form to secure the appropriate 
facts, administration of the application, determination whether minimum 
qualifications are met, and notification to the applicant of admission or 
rejection. The governing principle in application procedure is that only 
those items related to successful job performance, or statutory require- 
ments, should be contained in the application, and audited in a manner 
most conducive to the determination of qualification or the lack of it. 

Principal deficiencies of current recruitment procedures are: (1) un- 
imaginative and stereotyped recruitment, (2) tendency to conform to 
legal minima, (3) absence of objective information upon which to base a 
sound selection program, and (4) a lack of planning and basic research. 

The value of this report lies in its sound outline and treatment of the 
general problems of public service recruitment. It is not intended as a 
palliative indicating how more nurses, accountants, or social workers may 
be recruited in the war emergency but rather presents a number of prin- 
ciples upon which an adequate recruitment program should be based. 
This seems a wise departure for it is doubtful at best whether any volume 
of this scope can solve the problem of recruitment without a basic re- 
formulation of governmental and administrative policies. Although the 
report is by no means novel in its recommendations, following closely the 
earlier studies of Professor Kingsley and the Commission of Inquiry on
Public Service Personnel, its restatement is so fundamental as to make 
it mandatory reading. Nor does this detract from the excellent discus- 
sion of application procedure which comprises the second half of the book. 

Of specific merit is the stress placed upon the need for research and 
objective information regarding basic recruitment and selection prob- 
lems. The reiterated objections in the book to provisional or temporary 
appointments and the emphasis on forecasting of personnel needs as a 
corrective measure are wholesome. Of consequence also is the objection
to long and involved application forms where the information requested 
has no relation to job success. Following industrial practice, applications
should contain only those items that are of predictive significance
in placement, in addition, of course, to those required by statute.

It is the reviewer’s opinion that the career system as a cure-all for 
the recruitment problems of governmental agencies has been overstressed. 
Granting for the moment the assumptions of prestige for the public 
service and adequate selection and recruitment techniques—by no means
safe assumptions—have not other factors been overlooked, e.g., competition
of industry, business, or the professions; opportunities for higher
salaries or profits which public jurisdictions could not hope to meet; the
fact that an unknown number of individuals may not be interested in
a life career in one agency or locality? Attraction of individuals to
employment is always based upon a complex of factors of which the 
opportunity for a career is a variable weighted in varying degrees. 

There is furthermore something to be said for recruitment of special- 
ists at higher than the entrance level. Public jurisdictions are limited, 
and will remain limited by their very nature, in the scope and amount of 
training they can provide career neophytes in a large and varied number 
of occupations. Such training can usually be given only in professional 
and technical schools at considerable expense. Selection of numbers of 
recruits on the basis of appropriate aptitude may well be an insurmountable
task, for such a diversity of aptitude tests does not exist and public
personnel aptitude testing is even less advanced than subject matter or 
achievement testing. Nor can a probationary period substitute for 
inadequate selection. The concept of a career system is laudable when 
due consideration is given related factors in recruitment. 


Arthur Burton 
California State Personnel Board 